How to Split a Huge CSV Excel Spreadsheet Into Separate Files
Pinterest Stumbleupon Whatsapp
Advertisement

Excel is excellent at so many day-to-day tasks. But every now and then, you encounter one of its shortcomings: the size of a spreadsheet. If you need to know how to make an Excel spreadsheet smaller or how to split a large CSV, we have you covered.

Hitting the spreadsheet row limit is a pretty serious task in and of itself. The current Excel spreadsheet row limit is 1,048,576. That’s right — over 1 million rows, and the same amount of columns, too.

But hitting that row limit is actually incredibly easy, during certain tasks. If you’re marketing via email, you might import a CSV file How to Import Data Into Your Excel Spreadsheets the Neat & Easy Way How to Import Data Into Your Excel Spreadsheets the Neat & Easy Way Have you ever struggled with importing or exporting data into a spreadsheet? This tutorial will help you master the art of moving data between Microsoft Excel, CSV, HTML, and other file formats. Read More containing millions of email addresses. The only problem is how do you actually manage a spreadsheet with that many addresses in? Furthermore, what if someone sends you a CSV that is already over the limit (from another program)?

Well, I’m going to show you how to break down that enormous CSV into easily manageable chunks.

(Don’t have a large CSV handy? There are some pretty big CSV datasets available 15 Massive Online Databases You Should Know About 15 Massive Online Databases You Should Know About The Internet documents everything. Some of that data gets concentrated in massive knowledgebases. Call them online encyclopedias or databases -- the question is, do you know about the best ones? Read More over at data.gov. I’m going to use the 260,000 row Hospital Compare list, so you can use the same file.)

1. Use a Program

There are a number of useful CSV splitter programs out there. I’ll show you two of the best.

CSV Chunker

CSV Chunker is an open source CSV splitter. It can handle massive files, rapidly splitting them into chunks of your choosing. It took about 3 seconds to split our Hospital Compare CSV into 106 chunks, containing 2,500 rows each.

How to Split a Huge CSV Excel Spreadsheet Into Separate Files CSV Chunker before and after

CSV Splitter

CSV Splitter is our second tool. It offers much the same functionality of CSV Chunker, albeit with a slightly sleeker design. CSV Splitter took about 4 seconds to split our Hospital Compare CSV into 106, again containing 2,500 rows each.

How to Split a Huge CSV Excel Spreadsheet Into Separate Files csv splitter before and after

2. Use a Batch File

Next up, a programmable batch file How to Write a Simple Batch (BAT) File How to Write a Simple Batch (BAT) File Batch files can automate everyday computer tasks. We'll show you how batch files work and teach you the basics. By the end of this article, you'll have written your first batch file. Read More . We can use a batch file to process the CSV into smaller chunks, customizing the file to deliver different chunks.

Open a new Text document. Copy and paste the following:

@echo off

setlocal ENABLEDELAYEDEXPANSION

REM Edit this value to change the name of the file that needs splitting. Include the extension.

SET BFN=HCAHPSHospital.csv

REM Edit this value to change the number of lines per file.

SET LPF=2500

REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.

SET SFN=HosptialSplitFile

REM Do not change beyond this line.

SET SFX=%BFN:~-3%

SET /A LineNum=0

SET /A FileNum=1

For /F "delims==" %%l in (%BFN%) Do (

SET /A LineNum+=1

echo %%l >> %SFN%!FileNum!.%SFX%

if !LineNum! EQU !LPF! (

SET /A LineNum=0

SET /A FileNum+=1

)

)

endlocal

Pause

You’ll need to configure the batch file before running. I’ll tell you what each command does, and you can alter it to suit the size of your batch file, as well as the required output.

  • “SET BFN=” should point to the CSV you need to break down
  • SET LPF=” is the number of rows you’d like to limit your new file to
  • “SET SFN=” is the new naming scheme for your split files

Once you’ve entered your variables, head to File > Save As. Choose filename, and select Save. Then, select your newly saved text file and press F2 to rename it. Replace the TXT extension with BAT, and press OK when the warning appears. Now you can split your large CSV!

I warn you though, this method does take some time. You’re probably better off using one of the aforementioned splitters.

The original script and thread can be found here.

3. Use a PowerShell Script

Batch files are handy. You can use them for a wide range of day to day tasks. But PowerShell scripts are by-and-large faster, especially for this type of processing and division.

The following script will quickly cut your large CSV into smaller files. Again, I’ve used the Hospital Compare CSV for my example. It took about 3 seconds for the command to run. Open the Windows PowerShell ISE. If you’re unsure on how to do this, type powershell ise into your Start menu search and select the corresponding option. When the PowerShell ISE is open, select the Show Script arrow, located in the top right corner (outlined in red in the below image).

How to Split a Huge CSV Excel Spreadsheet Into Separate Files powershell csv script

Copy and paste the following command into the script panel, at the top of your screen.

$i=0; Get-Content C:\Users\Gavin\Downloads\Hospital_Revised_Flatfiles\HCAHPSHospital.csv -ReadCount 2500 | %{$i++; $_ | Out-File C:\Users\Gavin\Downloads\Hospital_Revised_Flatfiles\split\splitfile_$i.csv}

First, you need to change your input file location. Second, you need to specify how many rows your output files should contain. Lastly, specify your output file location. I’ve left my file paths in there to illustrate the situation. Furthermore, your output filename must have the “_$i.csv”suffix matching mine. Otherwise, your output files won’t number properly.

When you have edited the file with your own data, press CTRL + S to save your script.

The original script can be found here.

4. Load It in a Data Model

Our final solution isn’t actually splitting the large CSV into smaller parts. You see, Jose Barreto, a member of the OneDrive team, recently illustrated exactly how you can load a large CSV file in Excel. In this case, large means over the 1,048,567 row limit. He explains in a tutorial that instead of actually loading the file, Excel 2016 can create a data link to the CSV file.

How to Split a Huge CSV Excel Spreadsheet Into Separate Files cvs rows in excel data model pivot table

The data from the original CSV will be available to the user within a Data Model. Barreto created a Pivot Table using “up to 8.5 million rows with no problem at all.” The above image is taken from Barreto’s blog post (linked below), showing a total of 2 million rows within Excel. (Learn how to use a Pivot Table for data analysis How to Use an Excel Pivot Table for Data Analysis How to Use an Excel Pivot Table for Data Analysis The pivot table is one of the single most powerful tools in the Excel 2013 repertoire. It is frequently used for large data analysis. Follow our step-by-step demonstration to learn all about it. Read More .)

If this is the solution that suits you, follow Jose Barreto’s detailed tutorial right here. Remember, this isn’t splitting the CSV, but it does mean you can manipulate your data within certain Excel tools How to Visualize Your Data Analysis with Excel's Power Tools How to Visualize Your Data Analysis with Excel's Power Tools Excel is killing it with its advanced data management features. Once you have used one of the new tools, you will want them all. Become a master of your data analysis with power tools! Read More . If accessing your data to interact with it directly, this could be worth a try.

Break It Down!

We’ve listed four solutions to breaking down your massive CSV files, including two different tools.

Of the four solutions, the batch file is probably the slowest 5 Reasons You Should Use PowerShell Instead of Batch Scripting 5 Reasons You Should Use PowerShell Instead of Batch Scripting PowerShell is what you'd get if you crossed the Command Prompt with Batch Scripting, threw in some extra features, and kicked it all up several notches. Here are several reasons you should try it. Read More . The PowerShell script is fast and easily customizable, while the two applications are simple to use out of the box. Finally, you have the option of directly importing your data into Excel, by the millions of rows.

Which method do you choose? And do you have any other tips? Please share with us in the comments!

Image Credit: lucadp/Depositphotos

Leave a Reply

Your email address will not be published. Required fields are marked *

  1. Ralph Skinner
    December 1, 2017 at 9:03 am

    Loved the PowerShell example, saved me so much time. Now to work out how to merge the files back again once they have been analysed. Stephan Solomonidis, kudos to you.

  2. shams
    December 14, 2016 at 10:55 am

    That's not true anymore:

    CSV files have no limit of rows you can add to them. Excel won't hold more that the 1 million lines of data if you import a CSV file having more lines. Excel will actually ask you whether you want to proceed when importing more than 1 million data rows.

    • ee
      March 5, 2017 at 3:45 am

      That's not the point. The chap here is trying to help us split the files easily for our own reasons. Your rant is a technicality that nobody needs to know about. Go back in the basement!

  3. Ramesh
    October 24, 2016 at 4:00 am

    Awesome Worked like a charm
    :-)

  4. Ramesh
    October 24, 2016 at 4:00 am

    Awesome Worked like a charm
    :-)

  5. Juan Alvarez
    October 4, 2016 at 12:40 am

    God bless you for this entry.

  6. dima
    April 1, 2016 at 11:32 am

    Thank u so much

  7. Anonymous
    September 3, 2015 at 8:39 am

    I’m working on 10 million records email addresses..I have to verify it and remove invalid emails, bounces, spam trap... then create 200 CSV files. Which is the best software for clean huge databases? I need more tools? I need help please.

  8. Anonymous
    September 2, 2015 at 6:05 pm

    Hi Diana, I have the same problem. I'm working on 10 million records email addresses..I have to verify it and remove invalid emails then create 200 CSV files. Which is the best software for clean huge databases? You can help me please?

  9. Diana Ward
    May 24, 2015 at 8:08 pm

    Thank you! You saved me a lot of misery, since I was just tasked with the job of editing a mailing list database of over 9 million! It wouldn't fit into any program I have.

  10. Manish
    February 21, 2015 at 12:52 pm

    Is there any for iOS (Mac)

  11. Mark Murphy
    September 13, 2009 at 10:16 am

    Great time saver!!

    Thanks :)

  12. Colm
    August 17, 2009 at 7:58 pm

    Are there legitimate uses of CSV files containing millions of mail addresses? Maybe Excel is trying to tell you something! ;)

  13. CYMR0
    August 16, 2009 at 9:38 am

    It may not be appropriate, but why not use Access? Better suited for large number records than Excel.

  14. James Rintamaki
    August 14, 2009 at 9:59 am

    I actually ran into this issue a few days ago, and I used the same tool. One thing that also helped, in not needing to have it split it up into as many separate files, is that Excel 2007 can handle 2^20 rows (1,048,576) as opposed to 2003's 2^16 (65,536).

    Also, once you have all these separate files, you can put them all in one XLS file by adding them each as separate worksheets to the same file (the row limit is per worksheet).