Graphistry makes it easy to explore the hidden connections in any CSV or flat file by automatically exposing the underlying graph. This tutorial walks through the CSV Mini-App notebook that ships with Graphistry and applies it to the ICIJ's recently published Implant Files database of medical device recalls.
Screenshot: ICIJ’s The Implant Files visualized live with Graphistry – The pandemic of 70,000+ medical device recalls
1. Setup
- Download the recall data from the ICIJ and unzip it to get the event data file, events-1551346702.csv
- Launch Graphistry!
2. Go through the video tutorial!
- Launch and clone the CSV Upload Mini-App notebook, and rename the copy to ‘icij_implants.ipynb’
- Follow the instructions in the notebook
- Settings used for each section:
- Upload:
    file_path = './events-1551346702.csv'
- Data cleaning:
    hits = pd.DataFrame([[c, len(df[c].unique())] for c in df.columns], columns=['col', 'num_uniq']).sort_values('num_uniq')
    skip_nodes = ['icij_notes', 'determined_cause', 'action_classification', 'icij_notes', 'country', 'status', 'source']
    nodes = [x for x in list(hits.query('num_uniq > 10 & num_uniq < 9288')['col']) if x not in skip_nodes]
    df = df_orig.query('country == "USA"')
- Plotting:
    mode = 'B'
    max_rows = 50000
    node_cols = nodes
    categories = { }
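To see what the data-cleaning settings do before running them on the full recall file, here is a minimal sketch on a tiny in-memory CSV. The rows, the `id`, `manufacturer`, and `device` column names, and the thresholds 2 and 5 are all made up for illustration (the tutorial itself uses 10 and 9288 on the real file); only `country` and `status` come from the settings above.

```python
import io
import pandas as pd

# Hypothetical stand-in for events-1551346702.csv (illustrative rows only).
csv_text = """id,country,manufacturer,device,status
1,USA,Acme,Pump,Open
2,USA,Acme,Stent,Open
3,USA,Bolt,Pump,Closed
4,DEU,Bolt,Valve,Open
5,USA,Core,Clip,Closed
"""

# Upload step: read the CSV into a dataframe.
df_orig = pd.read_csv(io.StringIO(csv_text))
df = df_orig

# Count distinct values per column and sort from fewest to most.
hits = pd.DataFrame(
    [[c, len(df[c].unique())] for c in df.columns],
    columns=['col', 'num_uniq'],
).sort_values('num_uniq')

# Columns we never want as nodes, even if they pass the thresholds.
skip_nodes = ['country', 'status']

# Keep columns that are neither near-constant nor unique per row.
# Thresholds are scaled down for this toy data (assumption).
nodes = [
    x for x in list(hits.query('num_uniq > 2 & num_uniq < 5')['col'])
    if x not in skip_nodes
]

# Restrict the events to one country, as in the tutorial settings.
df = df_orig.query('country == "USA"')

print(nodes)
print(len(df))
```

The lower bound drops columns with too few values to form an interesting graph, and the upper bound drops columns such as row IDs where nearly every value is distinct and would add one disconnected node per row.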
Next steps & further reading
- Pandas pd.read_csv(): Load most files
- Graphistry’s graphistry.hypergraph(): Turn any Pandas dataframe into a live graph visualization
- Graphistry in Jupyter notebooks + UI guide
- ICIJ’s Implant Files investigation (+ the data!)
- Resulting use of CSV Mini-App notebook
- Graphistry 2.0 + Launch in your AWS
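For readers curious about what "turning a dataframe into a graph" means in the hypergraph step listed above, the following is a rough conceptual sketch in plain pandas. It is not Graphistry's implementation: the toy rows, the column names, and the `event::` / `column::value` node ID scheme are assumptions chosen for illustration. The idea is that each row becomes an event node, each distinct value in a selected column becomes an entity node, and edges connect a row to the values it contains.

```python
import pandas as pd

# Hypothetical toy rows; column names are illustrative only.
df = pd.DataFrame({
    'manufacturer': ['Acme', 'Acme', 'Bolt'],
    'device': ['Pump', 'Stent', 'Pump'],
})
node_cols = ['manufacturer', 'device']

# One "event" node per row.
events = [f'event::{i}' for i in df.index]

# One "entity" node per distinct (column, value) pair.
entities = sorted(
    {f'{c}::{v}' for c in node_cols for v in df[c].dropna().unique()}
)

# One edge from each row's event node to every entity value in that row.
edges = pd.DataFrame([
    {'src': f'event::{i}', 'dst': f'{c}::{row[c]}', 'edge_type': c}
    for i, row in df.iterrows()
    for c in node_cols
    if pd.notna(row[c])
])

print(len(events), len(entities), len(edges))
```

Rows that share a value, such as the two Acme recalls here, end up connected through the same entity node, which is how shared manufacturers or devices surface as clusters. In PyGraphistry, `graphistry.hypergraph()` performs this kind of transform on a dataframe; consult the library documentation for its exact options and return value.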