Consolidate many line based .CSV files in to a single .CSV with one header line and per file data lines

Summarize a huge amount of files that have line based columns and data in to a single file with the first line the headers found in all files and the actual data as per row for each file, while the headers might change throughout the source files and need to be added dynamically.

This is a special script I wrote for someone else that had about 45k files to process. It is crazy enough to be worth posting here 🙂 and can be found on Spiceworks as well.

Situation:

  • many .CSV files
  • all have the columns per line instead of in the first line
  • the data looks like
    • column,data
    • column,data
  • he needs all files transferred in to one file in this format
    • header,header,header
    • data,data,data
    • data,data,data
  • from per line to one line as a header and the data in each line per file
  • additional challenge
    • the headers might change throughout the files and add more headers

What the script does:

  1. cycle through all files
    1. detect all headers
  2. cycle a second time through all files
    1. detect all the data
    2. write the data in the right column per line per file

Flaws:

  • The script does not obey if there is data with a comma “,” – it would ignore what is behind that comma

Output:

  • Output file is a single .CSV file, comma separated columns

Execute this way:

  1. Source Directory – where the .csv files reside
  2. Target Directory – where the new output .csv will be created
  3. open CMD / command prompt
  4. go to the script-directory (where you saved it)
    1. CSCRIPT scriptname.vbs “c:\sourcedirectory” “c:\targetdirectory”

CSCRIPT will avoid that you see a million message boxes – it will output directory on your CMD / command prompt window…