What exactly is SQL? A simple summary
If you are looking to start out a career in the world of data and analytics, or just want to up your skills in data analysis in general, one of the most common terms you will come across is SQL.
There are a whole lot of different concepts and tools in this field from spreadsheets to dashboards to a range of coding languages such as Python and R, but arguably the single most important tool is SQL.
But what exactly is SQL? And why is it so important?
SQL is the foundation language of data
SQL (pronounced ‘sequel’) stands for Structured Query Language, and in basic terms is a coding language used to access data stored in a database (more on databases in another article).
For anyone familiar with spreadsheets – think of a single table in SQL as one tab on a spreadsheet. If you are looking to access and extract a set of data from this table you will use a SQL query. In this case a query simply refers to lines of code.
It’s worth noting that SQL in general forms the backbone of very sophisticated systems such as websites, but this article will be speaking in the context of data analysis.
Querying with SQL is very easy
If you aren’t familiar with any coding languages, referring to SQL queries as code may seem intimidating, but in reality the concepts are quite simple compared to more traditional coding done in languages such as Python or Javascript.
The general structure of a basic SQL query generally follows these rules:
- First you SELECT which columns you would like to extract. This is particularly useful if the table has hundreds of columns but you only need to work with a few
- Then you let the system know where you are accessing this FROM. In this case you enter the name of the table and the database name that it is sitting within (usually easily viewable in the UI of whichever tool you are using)
- Finally, as an optional third step you can set a filter WHERE the data needs to fit a specific criteria
Let’s say you have a data set containing a list of every country and their primary spoken language, and wanted to display a list of every country that speaks Spanish. The SQL query that you would type to get this data is quite simple. In fact the highlighted words above are part of the code!
When writing this out as a real query it would look something like the below –
SELECT Country, Language
FROM ListOfCountriesDatabase
WHERE Language = 'Spanish'
This would result in the following output:
Country | Language |
---|---|
Argentina | Spanish |
Bolivia | Spanish |
Chile | Spanish |
Colombia | Spanish |
Costa Rica | Spanish |
Cuba | Spanish |
Dominican Republic | Spanish |
Ecuador | Spanish |
El Salvador | Spanish |
Equatorial Guinea | Spanish |
Guatemala | Spanish |
Honduras | Spanish |
Mexico | Spanish |
Nicaragua | Spanish |
Panama | Spanish |
Paraguay | Spanish |
Peru | Spanish |
Spain | Spanish |
Uruguay | Spanish |
Venezuela | Spanish |
When would SQL be used?
Now that we have covered what SQL means, and how it is used, lets discuss why we would want to use SQL to begin with.
SQL isn’t the only way computers are capable of handing data – after all Excel is used by just about every business out there! For a more detailed analysis of why people using Excel day to day should consider learning SQL read the following article:
Why Excel users should learn SQL
At its core, working with SQL and databases allows for both structure and scale. There is only so much that can be done with separate data files such as CSVs and Excel docs.
Let’s say you are running an online business and want to easily keep track of all sales. The easiest way to do this is by storing every single transaction within a database. SQL then makes it very easy to look up specific information.
Want to know what your top 3 best selling products were in the month of August? Simply enter this query:
SELECT Product_Name, Category, Total_Sales
FROM TransactionDatabase
ORDER BY Total_Sales
LIMIT 3;
The below output would then display:
Product_Name | Category | Total_Sales |
---|---|---|
Red Jacket | Clothing | 45 |
Black Parka | Clothing | 33 |
Leather Watch | Accessories | 20 |
What can we do with this data?
Running basic queries for smaller sets of numbers is really easy as we have seen, but what if we have more ongoing and complex requests? Running multiple queries a day every day to keep track of things can be time consuming.
Luckily for us, SQL is such a widely used language it is compatible with a range of systems! This is where storing everything within SQL databases really shines.
Some ways to extract and work with SQL queries are –
- Connect Excel directly to SQL in the form of a pivot table. This one is great if your standard output tends to be Excel spreadsheets. Simply type the SQL query into the connection and just refresh the pivot table when you need updated numbers!
- Work with data visualization platforms such as Tableau and Power BI.
- Use coding languages such as Python to extract data – the use cases here are nearly unlimited
Data visualization is a whole topic in itself, but lets look at a quick example using the above online retailer dataset again.
Rather than running a query manually to output a set of numbers each time, we can connect that same database to a visualization platform (Tableau is being used in this example), to create a basic chart containing the same information. Due to the compatibility of the language, the same SQL query has been entered into the connection, so no need to learn a second language. See below example:
Connecting to additional platforms like this saves everyone so much time as in a lot of cases these charts are automatically updated.
To summarise, SQL is simply a method for accessing and transforming data. This data is stored in databases and tables that can be extracted using a simple query (lines of code) specifying the scale and requirements to be outputted. SQL is so widely used and compatible with so many different platforms (and even other coding languages) that it is far and away the best way of working with data.
Of course, this article doesn’t even begin to scratch the surface of what SQL is capable of. Transforming and manipulating data is a whole series of articles in itself, but hopefully this has helped to provide a bit of context and understanding of what SQL actually is and why it would be used.