the:behavioral:lab

Converting a JSON file to CSV

JSON doesn’t map perfectly to XML hierarchies, and definitely doesn’t have an ideal proxy for formatting as a CSV. However, JSON data doesn’t load into Excel for analysis purposes. Existing code I’ve found in forums for turning JSON into CSV simply uses json_decode() and then used fputcsv(), which ignores if the columns have different headers. To solve this, I wrote some code in PHP that converts JSON into an array [using json_decode()], then into XML so that I could plug into into my XML to CSV parser. There may be a simpler way, but I already had the XML to CSV script written.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
<?PHP
//All you need to change is the first two lines, when the data comes from and where it goes.
//The top is the procedural elements. Get data-&gt;remove line breaks-&gt;declare vars-&gt;convert json to array-&gt;Flatten array-&gt;convert to XML-&gt;convert to CSV
$json_str = file_get_contents('json.txt');
$filename = "target.csv";
$white= array("\\r\\n", "\\n", "\\r");
$replace = ' ';
$json_str = str_replace($white, $replace, $json_str);//line breaks cause malformed XML
$json_obj = json_decode($json_str,true);
$json_arr = array();
$return = array();
$xml = new SimpleXMLElement("");
foreach ($json_obj as $fields) {
    $json_arr[]=flatten($fields);
    $return=array();
}
foreach($json_arr as $val)
{
	array_to_xml($val,$xml);
	XMLtoCSV($xml);
	$xml=new SimpleXMLElement("");//CSV must be written one row at a time. Each element in the array is single row. Must re-initialize XML afterwards.
}
echo("Completed");//If your file is huge (mine was 300mb) it may take a while to create. Its good to know when it's done.
function flat ($a,$b){global $return; $return[$b] = $a;}//my web hosting company doesn't have PHP 5.3 for some reason so I can't pass a closure to the flatten() function. There may be an easier way still... but this works at least.
function flatten($array) {
	global $return;
	array_walk_recursive($array, 'flat');
	return $return;
}
//Simple conversion script for array to xml. The XML is not necessary well-formed. Indexed arrays will yield elements with numbers for names.
function array_to_xml($arr, &amp;$xml) {
    foreach($arr as $key =&gt; $value) {
        if(is_array($value)) {
            if(!is_numeric($key)){
                $subnode = $xml-&gt;addChild("$key");
                array_to_xml($value, $subnode);
            }
            else{
                array_to_xml($value, $xml);
            }
        }
        else {
            $xml-&gt;addChild("$key","$value");
        }
    }
}
//This parser was designed to take XML data and write it to a CSV mapping data to specific columns based on the column name, and adding columns when necessary. I wrote it a long time ago, and the code gets edited more than the comments get updated. Some of the comments may be off.
function XMLtoCSV($data){
	global $filename;
	if(file_exists($filename))
	{
		$fh = fopen($filename, "r");
		$existing_columns = fgetcsv($fh); //this is an indexed array of all the column headings
		$original_columns = $existing_columns;
		$num_original_columns = count($original_columns);
		//go through all the new data and map it onto the existing stuff...create new headings for the new data
		//first dump the XML $data columns into an array
		$x=0;
		foreach($data as $key =&gt; $value)
		{
			$new_columns_array[$x] = $key;
			$x++;
		}
		$num_newdata = count($new_columns_array);
		$numberofnewvariables = 0;
		for($x=0;$x$location)
		{
			$write_array[$location] = $key;
		}
		//next create data array from XML
		$x=0;
		foreach($data as $key =&gt; $value)
		{
			$value = str_replace("'","",$value);
			$value = str_replace('"','',$value);
			$value = str_replace(',','',$value);
			$value = str_replace(';','',$value);
			$new_values_array[$x] = $value;
			$x++;
		}
		if($numberofnewvariables &gt; 0)
		{			
			$fh = fopen($filename, "r");
			$i=0;
			while(!feof($fh))
			{
				$data2[$i++]=fgets($fh);
			}
			$first_row=explode(",",$data2[0]);
			//add new column headers here
			$temp_numcolumns = count($first_row);

			for($x=0;$x&lt;$numberofnewvariables;$x++)
			{
				$first_row[$temp_numcolumns + $x] = trim($new_columns_array[$write_array[$num_original_columns+$x]]);
			}
			for($x=0;$x&lt;count($first_row);$x++)
			{
				$first_row[$x] = preg_replace( '/\r\n/', '', trim($first_row[$x]) );
				$newfirst_row .= $first_row[$x].",";
			}
			$newfirst_row .="\n";
			$data2[0]=$newfirst_row;
			fclose($fh);
			$fh = fopen($filename,"w");
			$i=0;
			while($i&lt;count($data2))
			{	
				fputs($fh,$data2[$i]);
				$i++;
			}
			fclose($fh);	
		}
		$fh = fopen($filename, "r");
		$total_columns = fgetcsv($fh);
		$num_total_columns = count($total_columns);
		fclose($fh);

		$fh = fopen($filename, "a");
		for($x=0;$x&lt;$num_total_columns;$x++) 		{ 			fwrite($fh, "\"". $new_values_array[$write_array[$x]]."\","); 		 		} 		fwrite($fh,"\n"); //terminate the line; 		fclose($fh); 	} 	else 	{ 		$fh =fopen($filename, 'w'); 		foreach($data as $key =&gt; $value)
		{
			//first loop creates column headers
			fwrite($fh,"$key,");
		}
		fwrite($fh,"\n");

		foreach($data as $key =&gt; $value)
		{
			//second  loop writes the data (and cleans it)
			$value = str_replace("'","",$value);
			$value = str_replace('"','',$value);
			$value = str_replace(',','',$value);
			$value = str_replace(';','',$value);
			fwrite($fh,"\"$value\",");
		}	
		fwrite($fh,"\n"); //terminate the line;
		fclose($fh);
	}
}
?>
Advertisements

Single Post Navigation

5 thoughts on “Converting a JSON file to CSV

  1. Sorry for the HTML escaped characters in the code. Word Press really need to come up with a reliable way to put code into blogs. Until I find the time to fix it, just do a word search in your text editor for & and replace accordingly.

  2. hugo on said:

    script has wrong synax, it’s a pitty…

    • Yeah. I mentioned in a comment that some things were messed up when pasting the code into WordPress. I need to fix it, but have been avoiding it. I’ll try to fix it today.

  3. Hey There. I found your blog using msn. This is a very well written article.
    I will be sure to bookmark it and come back to read more of
    your useful info. Thanks for the post. I will definitely comeback.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: