How good at SQL does a data scientist really need to be?

Question

Abhi · Answer

SQL is a standardized query language for requesting information from a database. [1]

On a scale of 1–10 where 1 only knows select * from table and a 10 can fluently build stored procedures and views, a data scientist should be at least a 7.

Why?

SQL is THE language for working through a database environment. It’s not the language to perform “science” on the data, but it is the language to pull and manipulate the data. A DATA scientist needs to be fluent in DATA. Being fluent in data means that they should have a proper understanding of the final stage of data governance.

Data governance is the capability that enables an organization to ensure that high data quality exists throughout the complete lifecycle of the data. [2] The final stage of data governance: querying the data.

If a data scientist fully relied on a data engineer or an ETL developer to get all of the data they needed, they would have a tough time finding an employer who wants them.

Are you going to develop a statistical approach on a table that contains 2 billion rows? What’s your plan? Store all of that in R or Python memory? Come on…

All things aside, SQL is an easy language to learn. It honestly mirrors the English language.

A data scientist, who is typically expected to be fluent in one of R, Python or SAS, could and should be able to learn and be proficient in SQL in a relatively short amount of time.
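
To make the point about the 2 billion row table concrete, here is a minimal sketch of letting the database do the heavy lifting instead of loading every row into Python memory. It uses Python's built-in sqlite3 module with a small in-memory table; the table name `events`, its columns, and the helper function names are made up for illustration, and a real warehouse would use a different driver and a far larger table.

```python
import sqlite3


def build_demo_db(n_rows=10_000):
    """Create a small in-memory table standing in for a very large one.

    The table and column names are hypothetical, chosen for this example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, amount REAL)")
    rows = ((i % 100, float(i % 7)) for i in range(n_rows))
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()
    return conn


def summarize_in_sql(conn):
    """Aggregate inside the database and return one row per user.

    Only the grouped summary crosses into Python, not the raw rows.
    """
    query = """
        SELECT user_id,
               COUNT(*)    AS n_events,
               AVG(amount) AS mean_amount
        FROM events
        GROUP BY user_id
        ORDER BY user_id
    """
    return conn.execute(query).fetchall()


conn = build_demo_db()
summary = summarize_in_sql(conn)
print(len(summary), "summary rows returned instead of the full table")
```

The statistical work then starts from the summary (or from a filtered or sampled query result), which is small enough to hand to R or Python comfortably.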
