We just made exploring the relationships in your data that much easier. With v2.35, already available for free Graphistry Hub accounts and Enterprise (Docker) users, you can drop any CSV/XLS export into Graphistry and explore its relationships as a graph!
Edit 2/19/2021: … And if your data looks good, think about joining the Web App Hack graph hackathon for $15K+ in prizes and to share your ideas with 100+ companies / 6K+ graph users!
The Graphistry uploader kicks off our 2021 initiatives to enable true no-code and low-code visual graph analytics. To get a sense of how fast and easy it can be, we did a couple of speed runs. For a YouTube CSV export, it took only 18 seconds, and most of the clicking was in the file picker (Animation 1). Any CSV works, so for analysts with log data, this article looks at a device activity export for a honeypot (credit: Mike Sconzo / secrepo.com).
Amazingly, it took only 8 clicks to map attackers<>targets, and then only 2 more clicks to explore attackers<>tactics! (Animation 5)
Animation 1: Drag-and-dropping a YouTube CSV export into Graphistry to automatically expose the relationships in it
Whether you’re an analyst curious what some events and entities look like, a data scientist exploring an embedding’s correlations, or an investigator working with a log dump, the new File Uploader helps you expose the key relationships more quickly. Underneath, you are still leveraging Graphistry’s GPU-accelerated visual graph analytics stack. Afterward, you can share your interactive visualization, dig into APIs like PyGraphistry for use in websites and notebooks, and use the gallery to revisit your visualization.
The uploader’s simple controls, combined with Graphistry’s existing flexible visual capabilities, enable powerful workloads. We’ll walk through an example of visually exploring a CSV export of some security honeypot logs recorded in a SIEM:
1. Drag-and-drop your data
Animation 2: Drag-and-drop files or click to open the file picker
Drop in a file ending with “.csv”, “.xls”/“.xlsx”, or another popular format. Splunk and Elastic, for example, let you export any search result to such a file.
Amazingly, the data does not need to be in any custom graph data format, just regular data tables with regular data headers. Columns can contain values like IDs, text, numbers, and time. You can drop in multiple files, and if you use Excel, multiple sheets work great. You can also upload data in big data formats like ORC and Parquet, which are great for handling more data and keeping data types clean. The initial release already handles spreadsheets too big for Excel: Graphistry will automatically compress them in your browser and only then upload them into your account.
Later, as you play with your data, you can return to this screen to add more files.
2. Inspect & Clean
Animation 3: Inspect & clean
The file uploader already has initial support for data cleaning. After you drop your data, hit the next button and check details like:
- Files uploaded, including size
- Tables found in each file, including the number of rows and columns
- Column names and the type of their values
The inspection screen is a fast way to check for common data quality issues. We see hundreds of rows loaded with reasonable column names. However, the event start/end times are not shown with a recognizable Time format, but as Float64 numbers, so they should be fixed if we want to use exploration features like the interactive time bar. The two columns representing device IDs (IPs) do use the same type as one another, as we would expect, so we can proceed. If we fixed the time values and wanted to upload the new file, we’d click the ‘1. Upload’ button and add the file in the same way as the original.
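If you want to fix the Float64 time columns before re-uploading, one option is a small script that rewrites them as timestamps. The sketch below uses only the Python standard library. It assumes the floats are Unix epoch seconds and that the columns are named StartTime and EndTime; both are illustrative guesses, since the real export's column names and time units may differ.

```python
import csv
import io
from datetime import datetime, timezone

# Made-up sample rows; the column names StartTime/EndTime are hypothetical.
raw_csv = """AttackerIP,VictimIP,StartTime,EndTime
203.0.113.7,10.0.0.5,1412121600.0,1412125200.0
198.51.100.2,10.0.0.5,1412208000.0,1412208060.0
"""

def epoch_to_iso(value: str) -> str:
    """Convert a float-like Unix epoch (seconds) string to an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(float(value), tz=timezone.utc).isoformat()

def clean_times(csv_text: str, time_columns: list) -> str:
    """Return the CSV text with the given columns rewritten as ISO 8601 timestamps."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in time_columns:
            row[col] = epoch_to_iso(row[col])
        writer.writerow(row)
    return out.getvalue()

cleaned_csv = clean_times(raw_csv, ["StartTime", "EndTime"])
print(cleaned_csv)
```

Saving the cleaned text to a new .csv file and dropping it into the upload screen should give the inspector a better chance of recognizing the columns as times, though how the uploader infers types is not something this sketch can guarantee.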
3. Visualize and share
Animation 4: Visualize tables as graphs with a few clicks
In the file uploader, we’re especially excited by the powerful step that enables you to turn any table into a graph without coding:
Expand the Visualize tab’s Shape -> Edge file section and pick a table to use for graph edges. Each table row will turn into an edge. Pick one column to use for the edge Source and another for the edge Destination. In this example, we have IP addresses of attackers and their victims, so we can start by showing who is attacking which victim. Each log line provides an AttackerIP column and a VictimIP column, so we pick those. This creates an edge for every row with an (AttackerIP -> VictimIP) pair.
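To make the row-to-edge idea concrete, here is a minimal standard-library sketch of the same mapping. It illustrates the concept only and is not how Graphistry implements it. The sample rows and the vulnerability labels are made up.

```python
import csv
import io

# Made-up rows standing in for the honeypot export.
raw_csv = """AttackerIP,VictimIP,Vulnerability
203.0.113.7,10.0.0.5,vuln-A
203.0.113.7,10.0.0.6,vuln-A
198.51.100.2,10.0.0.5,vuln-B
"""

def table_to_edges(csv_text: str, source_col: str, destination_col: str) -> list:
    """Turn every table row into one (source, destination) edge."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [(row[source_col], row[destination_col]) for row in rows]

edges = table_to_edges(raw_csv, "AttackerIP", "VictimIP")

# Nodes are simply the distinct values seen in either column.
nodes = sorted({ip for edge in edges for ip in edge})

print(edges)
print(nodes)
```

Choosing a different pair of column names is all it takes to get a differently shaped graph from the same table.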
Setting the edge table’s source/destination columns is all you need for the live preview to kick in. We already see patterns pop out like how one device is being especially targeted, and how a few attackers are going after the same victims. This is already interesting, so we can copy the URL and share the interactive visualization with our teammates. The last visualization of a session is what goes into your personal visualization gallery, so whenever we stop, we can always go there to find it again.
4. Interactively edit and reshape!
Much of the power of Graphistry is in its interactivity. For the uploader, all the fields are live-editable, such as the visualization name and description. When you pause editing, the live preview over your uploaded data will reload. This is powerful because you can use Graphistry’s visual insights to guide you in making even better ones… including exploring different enlightening graph shapes for the same data with just one or two clicks.
For example, when looking at the graph of Attacker IP -> Victim IP, we might notice the column Vulnerability is interesting. It gives context for the tactics behind how different attackers are approaching their victims. Vulnerabilities tell us a lot about the attack, but also about both the attackers and the impacted systems. For a simple start, we can use the visualization to inspect the top 10 vulnerabilities. That does not clarify where they are being used, so we go further by visually coloring the edges by the vulnerability used. The resulting map of Attacker IP –[vulnerability]–> VictimIP shows interesting color patterns: Something is happening.
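Outside the UI, the same two questions (which vulnerabilities are most common, and which color each edge should get) can be sketched in a few lines of standard-library Python. The rows and vulnerability labels below are invented, and the integer "color index" is just an illustrative stand-in for whatever palette a visualization tool applies.

```python
from collections import Counter

# Made-up rows standing in for the honeypot export.
rows = [
    {"AttackerIP": "203.0.113.7", "VictimIP": "10.0.0.5", "Vulnerability": "vuln-A"},
    {"AttackerIP": "203.0.113.8", "VictimIP": "10.0.0.5", "Vulnerability": "vuln-A"},
    {"AttackerIP": "198.51.100.2", "VictimIP": "10.0.0.6", "Vulnerability": "vuln-B"},
    {"AttackerIP": "198.51.100.2", "VictimIP": "10.0.0.6", "Vulnerability": "vuln-C"},
    {"AttackerIP": "203.0.113.9", "VictimIP": "10.0.0.5", "Vulnerability": "vuln-A"},
]

def top_values(rows: list, column: str, n: int = 10) -> list:
    """Return the n most frequent values of a column with their counts."""
    return Counter(row[column] for row in rows).most_common(n)

def color_index_by_value(rows: list, column: str) -> dict:
    """Assign each distinct value an integer, in order of first appearance."""
    palette = {}
    for row in rows:
        palette.setdefault(row[column], len(palette))
    return palette

top_vulnerabilities = top_values(rows, "Vulnerability")
edge_colors = color_index_by_value(rows, "Vulnerability")

print(top_vulnerabilities)
print(edge_colors)
```

The frequency table answers "what is common", while the per-edge color shows where each vulnerability is used, which is why the colored graph reveals patterns the top-10 list alone does not.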
Animation 5: Dropdown menu to flip from mapping Attackers<->Victims to Attackers<->Tactics
Our favorite feature of the uploader, beyond needing only a few point-and-click interactions to go from table to graph, is using the same capability to explore different graphs. With just two clicks, we can flip the map from “Attacker IP –[vulnerability]–> VictimIP” to “Attacker IP –> Vulnerability”. As different attackers use different tactics, this gives us a sense of the different campaign behavior. For example, many attackers do a quick check for the same popular vulnerability and leave, appearing as a hub-and-spoke pattern where a vulnerability is circled by attackers with no other activity. A few attackers do go deeper, and we see a few distinct campaigns appearing as separate clusters of attacker+vulnerability combinations.
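The reshaping step amounts to picking a different destination column. Below is a rough standard-library sketch of that flip, followed by a simple way to separate "one quick check" attackers from those that go deeper. The rows are invented, and treating "exactly one vulnerability" as the hub-and-spoke signal is an assumption made for illustration.

```python
from collections import defaultdict

# Made-up rows standing in for the honeypot export.
rows = [
    {"AttackerIP": "203.0.113.7", "VictimIP": "10.0.0.5", "Vulnerability": "vuln-A"},
    {"AttackerIP": "203.0.113.8", "VictimIP": "10.0.0.5", "Vulnerability": "vuln-A"},
    {"AttackerIP": "203.0.113.9", "VictimIP": "10.0.0.6", "Vulnerability": "vuln-A"},
    {"AttackerIP": "198.51.100.2", "VictimIP": "10.0.0.6", "Vulnerability": "vuln-B"},
    {"AttackerIP": "198.51.100.2", "VictimIP": "10.0.0.6", "Vulnerability": "vuln-C"},
]

def reshape(rows: list, source_col: str, destination_col: str) -> list:
    """Build deduplicated (source, destination) pairs from any two columns."""
    return sorted({(row[source_col], row[destination_col]) for row in rows})

# Flip from AttackerIP -> VictimIP to AttackerIP -> Vulnerability.
attacker_to_vuln = reshape(rows, "AttackerIP", "Vulnerability")

vulns_by_attacker = defaultdict(set)
attackers_by_vuln = defaultdict(set)
for attacker, vuln in attacker_to_vuln:
    vulns_by_attacker[attacker].add(vuln)
    attackers_by_vuln[vuln].add(attacker)

# Spokes: attackers seen with exactly one vulnerability.
one_shot = {a for a, vulns in vulns_by_attacker.items() if len(vulns) == 1}
# Candidates for a deeper campaign: attackers seen with several vulnerabilities.
deeper = {a for a, vulns in vulns_by_attacker.items() if len(vulns) > 1}

print(attacker_to_vuln)
print(one_shot, deeper)
```

In this toy data, vuln-A is the hub with three one-shot attackers around it, while the attacker using vuln-B and vuln-C forms its own small cluster.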
Other graph shapes are useful too. For example, we can also add a node table with a column DeviceIP and then additional columns describing the IP, like Name or TimeFirstSeen. Whenever a value in the AttackerIP or VictimIP column matches one found in the DeviceIP column, Graphistry will automatically enrich its node with the DeviceIP node data. The next versions of Graphistry will enable even more powerful shaping options.
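Conceptually, this enrichment is a lookup join between the IPs appearing in the edges and the DeviceIP column. The sketch below shows that idea with plain Python dictionaries. It is not Graphistry's implementation, and the device names and dates are invented for the example.

```python
# Made-up edges and node table; Name and TimeFirstSeen values are hypothetical.
edges = [
    ("203.0.113.7", "10.0.0.5"),
    ("198.51.100.2", "10.0.0.5"),
]
node_table = [
    {"DeviceIP": "10.0.0.5", "Name": "honeypot-1", "TimeFirstSeen": "2021-01-04"},
    {"DeviceIP": "203.0.113.7", "Name": "unknown-scanner", "TimeFirstSeen": "2021-01-05"},
]

def enrich_nodes(edges: list, node_table: list, key: str = "DeviceIP") -> dict:
    """Attach node-table attributes to every IP that appears in the edges."""
    attributes = {row[key]: row for row in node_table}
    node_ids = sorted({ip for edge in edges for ip in edge})
    # IPs missing from the node table keep only their ID.
    return {ip: attributes.get(ip, {key: ip}) for ip in node_ids}

enriched = enrich_nodes(edges, node_table)
for ip, attrs in enriched.items():
    print(ip, attrs)
```

Nodes without a matching DeviceIP row still appear in the graph; they just carry no extra attributes.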
Try for yourself!
We’re already having a blast running different CSVs through the file uploader. To try it on your own data export, log in to a free Graphistry Hub account and drop in your own data. For private data, the file uploader is part of the v2.35 release, which came out today for Enterprise users (Docker) and arrives next week on the Graphistry AWS+Azure Marketplace, where you can one-click launch an instance for your private cloud.
Curious? Join the graph hackathon!
In parallel, we’re announcing The Web App Hack, where you can win $15K+ in prizes and get your ideas shared with 100+ companies and 6K+ graph users!