
JSON to Parquet File Conversion using Python

Updated: at 12:00 AM

Python

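You will need the pandas library, plus a Parquet engine such as pyarrow or fastparquet, which DataFrame.to_parquet relies on to write the file. If they are not installed yet, run pip install pandas pyarrow.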

main.py file

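import pandas as pd
import sys

# Input JSON path and output Parquet path come from the command line.
file_in = sys.argv[1]
file_op = sys.argv[2]

# Load the JSON array of records into a DataFrame, then write it out as Parquet.
data = pd.read_json(file_in)
data.to_parquet(file_op)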

JSON

data.json

[
  {
    "event_id": "b8a7eb74-cfca-49e4-83c4-728c10458c23",
    "title": "Zieme and Sons",
    "language": "en",
    "content": "entrepreneur I am Kate Daniel IV living in Fort Freeman and my email address is Ena62@gmail.com and August6@hotmail.com. I work for the company McKenzie, Koepp and Bednar my mobile to reach is 776.923.5653"
  },
  {
    "event_id": "ab83a922-44f7-4e57-b24a-92a9bd193ed5",
    "title": "Bode - Funk",
    "language": "en",
    "content": "I am Bessie Cormier my mobile to reach is 219-795-3134 x8903"
  },
  {
    "event_id": "bc459a28-3ce5-423e-b68a-05464e32fce0",
    "title": "Reichert Group",
    "language": "en",
    "content": "I am iving in Sandraborough. I work for the company Hermann - Herzog"
  },
  {
    "event_id": "430d3ba2-58a1-4b4f-9628-c6a99eced72a",
    "title": "Schamberger, Schmidt and Reilly",
    "language": "en",
    "content": "Hello I am veteran living in Deckowberg"
  },
  {
    "event_id": "ca40439d-4f0c-4490-b3c9-5e0282dd95f9",
    "title": "Cervantes - Bailey",
    "language": "en",
    "content": "Richard Tucker, born on July 28, 1981, living at 331 Mckinney Mount Cruzland, SD 16758, with a phone number of 540-308-0574x8345 and email address thomasrodriguez@gmail.com, recently applied for a loan."
  }
]
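
To see how pd.read_json interprets a file shaped like this, here is a minimal sketch. The two records are hypothetical stand-ins for data.json, not the real file. Each object in the array should become one row, and each key one column.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-record sample shaped like data.json (not the real file).
text = (
    '[{"event_id": "id-1", "title": "First", "language": "en", "content": "Hello"},'
    ' {"event_id": "id-2", "title": "Second", "language": "en", "content": "World"}]'
)

# read_json accepts a file path or a file-like object; StringIO stands in for the file.
df = pd.read_json(StringIO(text))

print(df.shape)          # (rows, columns)
print(list(df.columns))  # column names taken from the JSON keys
```

Because every record carries the same four keys, the resulting table has no missing values. Records with differing keys would still load, with the gaps filled by NaN.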

Running

python main.py ~/data.json ~/data.parquet