For more than 15 lakh (1.5 million) records, here’s a detailed recommendation:


Best Choice: CSV (for large flat data)

Reasons:

  • CSV can be processed line-by-line with very low memory usage.
  • PHP can stream through the file using fgetcsv() without loading everything into memory.
  • JSON, by contrast, normally has to be read and decoded in full with json_decode(), which will likely crash or be far too slow with 15+ lakh records.

Example: Efficient CSV Reading in PHP (Memory-Friendly)

$handle = fopen("bigdata.csv", "r");
if ($handle !== FALSE) {
    while (($row = fgetcsv($handle, 0, ",")) !== FALSE) {
        // Process each row (e.g., insert into DB, print, etc.)
        // Example:
        // print_r($row);
    }
    fclose($handle);
}
  • This approach stays efficient even for gigabyte-size CSV files.
  • You can add a row counter if needed:
$handle = fopen("bigdata.csv", "r");
if ($handle !== FALSE) {
    $count = 0;
    while (($row = fgetcsv($handle, 0, ",")) !== FALSE) {
        $count++;
        // Process row
    }
    fclose($handle);
    echo "Total rows: $count";
}

Avoid JSON for 15+ Lakh Rows

$json = file_get_contents("bigdata.json"); // High memory usage
$data = json_decode($json, true);          // Very slow / may crash

Problems:

  • Loads entire file into RAM.
  • May hit PHP memory limits (e.g., 128MB or 512MB).
  • Not suitable for huge datasets unless you use chunked/streaming parsing, which adds complexity (see the streaming sketch under Final Advice).
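
To see the problem concretely, you can compare the configured limit with the peak memory a full json_decode() actually uses. This is only a rough illustration assuming a bigdata.json file; the exact figures depend on your data and PHP version.

echo "memory_limit: " . ini_get("memory_limit") . "\n";

$json = file_get_contents("bigdata.json");
$data = json_decode($json, true);

// Decoded PHP arrays carry significant per-element overhead, so peak usage
// is typically several times the size of the JSON file itself.
echo "Peak memory: " . round(memory_get_peak_usage(true) / 1048576) . " MB\n";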

Summary

Format | Best For            | Handles 15+ lakh data? | Speed  | Memory Use
CSV    | Flat/tabular data   | Yes                    | Fast   | Low
JSON   | Nested/complex data | No (not recommended)   | Slower | High

Final Advice

  • Use CSV if your data is flat (like rows from a database or Excel).
  • If you’re dealing with hierarchical or nested data and must use JSON, consider splitting the file into smaller parts or using a streaming parser such as JSONMachine (see the sketch below).
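
For completeness, here is a minimal sketch of that streaming approach with JSONMachine (the halaxa/json-machine Composer package). It parses the file incrementally and yields one decoded record at a time, so memory stays low. The file name and record structure are assumptions for illustration; check the library's documentation for the exact options.

require "vendor/autoload.php";

use JsonMachine\Items;

// Iterates over the top-level JSON array one record at a time,
// without ever decoding the whole file into memory.
$items = Items::fromFile("bigdata.json");

foreach ($items as $key => $item) {
    // $item is one decoded record; process it here
    // (e.g., insert into DB or write it out as a CSV row)
}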
