
What You’ll Learn
Excel and external data are more connected than most users realise, and if you have never set up a live data connection before, you are about to save yourself a serious amount of time every single week.
Here is a scene you might recognise. It is Monday morning. You need to update the weekly report. So you open your browser, export a CSV from the CRM, copy a table from a supplier’s website, wait for IT to send you the database extract they promised by nine, and then spend the next forty minutes cleaning up inconsistent formatting before you can do any actual analysis. By the time the data is ready, half the morning is gone.
This lesson is the better way. You are going to learn how to connect Excel directly to websites, SQL databases, and live APIs — clean and reshape data before it even touches your spreadsheet — and set up automatic refresh so everything stays current without you having to babysit it. This is an advanced lesson. If you have not used Power Query before, start with Lesson 20 on Power Query Basics and come back here once you are comfortable with the foundations.
The copy-paste workflow feels fine when you are doing it once. The problem is it is never once. It is every Monday. Every month-end. Every time someone sends a revised file. And every manual step you add is another place where something can quietly go wrong.
You grab the wrong tab. You paste into the wrong row. You miss the last three rows because they were cut off below the fold. The numbers look right, but they are not — and you do not find out until someone spots a discrepancy in a meeting.
Here is the thing: this is not a skill problem. It is a process problem. The copy-paste approach is fundamentally unreliable at scale because humans are not machines. Once you connect directly to a source, the data comes in the same way every time — through the same steps, in the same format. Repeatable by design.
Power Query is the engine underneath every external data connection in Excel. Think of it as the middleman between your data source and your spreadsheet. It picks up the raw data, does whatever cleaning and reshaping you have told it to do, and hands the finished result to your sheet.
When you use Get Data from any source, you are using Power Query. The steps you build in the Power Query Editor get saved as a recipe. Every time you refresh, Excel runs that same recipe again on the latest data. You set it up once and it just works.
Pulling a table from a website is one of the most common Excel and external data tasks. Here is exactly how to do it.
Steps: Import Data From a Website
💡 PRO TIP
If you keep getting a blank Navigator or no tables detected, the page is almost certainly loading its data with JavaScript rather than static HTML. Excel can only read what is already in the page source. If the data loads after the page does, Excel cannot see it. Look for an API or a direct export option from the source system instead.
Advertisement-X
Do not just grab the first table in the Navigator list. Websites use HTML table formatting for layout elements all the time, not just for data. You might see five or six results labelled Table 0, Table 1, Document, and so on. Click through each one and check the preview on the right until you find the actual data you are after.
The From Web connector is useful but it has real limits you need to know about before you rely on it for anything critical.
⚠️ COMMON MISTAKE
Assuming your web connection will keep refreshing just because it worked the first time. If the page structure changes even slightly, the table name in the Navigator may change too and your query will break on the next refresh. Always test the refresh after any major website update before you rely on that connection in a live report.
Steps: Connect Excel to SQL Server
Windows authentication uses the credentials you are already logged in with on your computer. In a corporate environment on the same network as the server, this usually just works — no extra passwords required.
SQL Server authentication uses a separate username and password specific to the database. You will need these from whoever manages the server. One thing to be careful about: if you save these credentials in the connection and share the file, anyone who opens it may be able to use that connection. Think before you share.
Trust me on this: never load an entire database table into Excel if you can avoid it. A production table with two million rows will either crash the import, hit Excel’s row limit of 1,048,576, or turn your file into a slow, bloated mess. Use the SQL Statement field under Advanced Options to paste a targeted query. Pull only the columns you need, add a WHERE clause to limit the date range or filter by status, and do the heavy lifting at the database level.
Example SQL Query — Targeted Import
SELECT OrderID, CustomerName, OrderDate, TotalValue FROM dbo.Orders WHERE OrderDate >= '2024-01-01' AND Status = 'Completed'
Paste this into Data → Get Data → From SQL Server → Advanced Options → SQL Statement
⚠️ COMMON MISTAKE
Loading a full database table when a filtered query would do the job. If the connection was built without a WHERE clause, check it before you inherit the performance problem. A table with two million rows will silently truncate at Excel’s row limit of 1,048,576 — and you will not always get a warning that rows were cut off.
| Error Message | Likely Cause | Fix |
|---|---|---|
| Data source name not found | Wrong server name or no network access | Check server name; connect VPN if remote |
| Login failed | Wrong credentials or insufficient permissions | Verify username/password; ask IT to check permissions |
| Query timeout | Query returning too much data or server under load | Add WHERE clause to narrow results; try off-peak |
You have probably seen JSON if you have ever looked at data from a REST API. It looks like a wall of curly braces, square brackets, and key-value pairs — and if you have never worked with it before, it feels impenetrable.
Here is the thing: almost every modern data source uses JSON. Financial APIs, weather data, logistics tracking, e-commerce platforms, CRM systems with API access — all of it comes back as JSON by default. If you can bring JSON into Excel and flatten it into columns and rows, you unlock a huge range of data sources that most Excel users never touch.
Steps: Import a Local JSON File
Steps: Connect to a Live API (Excel Live Data Feed)
💡 PRO TIP
API responses are often nested three or four levels deep before you reach the actual values. Every time you see Record or List in a column, that is another layer to expand. Keep going until you hit plain text or numbers. Power Query saves every expansion step, so the entire process runs automatically on every refresh — you only build it once.
Manual refresh is the default. You right-click the table and click Refresh, or go to Data and click Refresh All. Everything updates on demand — you stay in control of when data is pulled.
Refresh on open is one step further. Go to Data → Queries and Connections, right-click your query, select Properties, and under the Usage tab tick Refresh data when opening the file. Now every time someone opens the file, Excel pulls fresh data automatically before they see anything. No manual step required.
In the same Query Properties dialog, tick Refresh every and set a number of minutes. While the file is open, Excel refreshes on that schedule automatically — useful for dashboards that need near-live data, like a sales board updating every fifteen minutes.
Important caveat: this only runs while the file is open on your computer. If you need data to refresh on a schedule without someone keeping the file open, you need Excel Online with a data gateway, Power BI, or a scripted scheduled task. Plain Excel desktop cannot refresh in the background while closed.
Connections break. Servers go down, URLs change, API keys expire. When Excel cannot refresh, it throws an error in the Queries and Connections panel and your data stays frozen at the last successful refresh — it does not disappear, but it is stale, and it might not look stale. For any report that uses external data, build in a last-refreshed timestamp so you and your colleagues always know how current the data actually is.
Go to Data → Queries and Connections. This is the control panel for every connection in your workbook. Right-click any connection to edit its properties, trigger a manual refresh, or open it in the Power Query Editor to update transformation steps. If a data source changes — a server gets renamed, a URL is updated — you fix it here rather than rebuilding from scratch.
If you delete a connection from the panel, the data already loaded into your sheet stays put — it just becomes static. This is occasionally useful when you want to lock in a snapshot of data at a specific point in time without keeping a live connection running.
When you share a file with external data connections, the person receiving it needs to be able to reach the same data source you used. A SQL connection that works on your corporate network will fail when someone opens it from home without a VPN. A web connection with authentication will prompt them for credentials they may not have.
Tell colleagues what connections the file uses before you share it. For structuring and matching the data that comes in through these connections, see Lesson 35 on VLOOKUP and Data Matching, which covers how to combine external imports with your existing sheet data.
| Source Type | Excel Path | Best For | Key Limitation |
|---|---|---|---|
| Website (HTML table) | Get Data → From Web | Static public tables | No JS-rendered content |
| SQL Server | Get Data → From Database | Corporate databases, CRMs | Needs network/VPN access |
| JSON File | Get Data → From File → From JSON | API exports, config files | Nested structure needs expanding |
| Live API (JSON) | Get Data → From Web (API URL) | Financial feeds, live tracking | Auth headers require manual setup |
Excel works well with external data up to a point. Once you are regularly pulling hundreds of thousands of rows, refresh times get long, files get heavy, and the experience degrades fast. The hard row limit is 1,048,576 rows per sheet. Exceed that and data is silently truncated — you will not always get a clear warning. Write targeted queries with WHERE clauses that limit what you bring in, and you will avoid this problem in most cases.
Excel is a fantastic analysis tool. It is not a data infrastructure tool. When you find yourself regularly pulling millions of rows, joining multiple sources before loading, building reports that need to be automatically published, or maintaining connections that must run without human intervention — you have outgrown what Excel is designed to do.
That is not a failure. That is knowing your tools. Power BI, Python with pandas, or a purpose-built ETL platform will handle that kind of work without the limitations. Knowing when to hand off is part of being genuinely good at this.
🧪 Try It Yourself
Here is a practical exercise to run through the full workflow covered in this lesson.
That is the complete loop: connect, transform, load, refresh. Once that sequence makes sense to you, every other Excel and external data connection you build follows exactly the same logic.
📚 Further Reading
→
Microsoft Docs: Import data from external data sources (Power Query)
The official reference covering all Get Data connectors, authentication options, and refresh settings in Excel.
→
Contextures: Power Query Tutorials
Detailed worked examples with screenshots covering web imports, transformations, and common query scenarios. Worth bookmarking.
Previous Lesson
Next Lesson
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.
Advertisement-X