this post was submitted on 23 Aug 2023

Our data engineer insists on lowercasing everything and stripping some other formatting, like newlines, from free-text fields.

They say it's "better for Elasticsearch".

To me that makes no sense, and it loses information that can't be added back, but I couldn't convince them otherwise. So far no real problem has come out of it, but it makes for a worse experience for the user. For example, company names that are acronyms show up in all lowercase (ibm, llc, etc.), and in free-text fields we lose the places where the user wrote in caps or added paragraphs.

What are your thoughts on this?

Disclaimer: I'm not a data engineer, just a PM on a data-related product.

[–] [email protected] 1 points 1 year ago

If space is not an issue, you can keep both versions in your db: the original text for display and a normalized copy for search. That way, you don’t need to figure out how to restore the formatting later.
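
To make that concrete, here is a rough sketch of what I mean, not a description of how your pipeline actually works. The field names `display_text` and `search_text` and both helper functions are made up for illustration, and I'm assuming the normalization is just lowercasing plus collapsing newlines and other whitespace, as you described.

```python
import re


def normalize_for_search(text: str) -> str:
    """Collapse every whitespace run (including newlines) to one space, then lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def make_record(raw_text: str) -> dict:
    """Store the untouched original for display and a derived copy for search."""
    return {
        "display_text": raw_text,  # hypothetical field: exactly what the user typed
        "search_text": normalize_for_search(raw_text),  # hypothetical field: normalized copy
    }


record = make_record("IBM Global Services LLC\n\nURGENT: please call back.")
print(record["display_text"])  # original casing and paragraph break preserved
print(record["search_text"])   # ibm global services llc urgent: please call back.
```

The point of the design is that the search copy can always be regenerated from the original, but not the other way around, so the original is the one you want to keep.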

Side note: there is an underlying issue, which is that you and your data engineer don’t communicate technical needs well. It’s a common challenge, so no judgment or condescension meant from me. Consider taking short courses on the technologies your team uses, so you can get better information and context from your meetings with them. I recognize that expecting you to organize that instead of your boss isn’t fair, but I hope it helps you avoid future friction and stress.