Sign Up

Using Splitters

File-splitter

This is a special kind of node that you can use to loop over a file and execute a series of steps in a workflow on each line, possibly in parallel.

To get started, use your search in Left sidebar to find file-splitter. Connect your file (or tool) to file-splitter ’s multiple input. You will get a string output that you can connect to any other tool. Any tool you connect to file-splitter (including tools further down the chain) will run once for each line in the input file.

In the example below, 2 parallel subfinder instances will run:

subfinder -d trickest.io

And

subfinder -d trickest.com
GIF of a custom made workflow of three nodes and running the splitter node

Trickest file-splitter in action

Tips

  1. Every tool connected to a file-splitter using a file connection (the orange one), will run separate instances for each line. To collect the outputs of every instance in one node, use a folder connection (the yellow one) like in the example above. You can use a script like recursively-cat-all to merge all outputs into one file, zip-to-out to compress the output folder (while maintaining the directory structure) into a zip archive, or write your own custom script.
  2. Assigning more machines to a workflow will allow the file-splitter nodes to run in parallel, resulting in a shorter workflow runtime and faster results.
  3. Currently, there is a 500-line limit on the file-splitter input. If you need more iterations, you should consider modifying the workflow structure. It could make more sense to use tools that can process files rather than individual strings. Check out the following section for more details on how to batch-process a file.

Batch-output

You can use file-splitter long with 2 more nodes to create an even more powerful pattern. This pattern allows you to split a file into smaller files (batches) with a specific number of lines in each batch, then pass these batches to the following nodes concurrently.

Screenshot of the workflow editor with three nodes and file splitter node interconnected

Trickest batch-output

Start by dragging the required nodes (generate-line-batches, file-splitter, and batch-output) from the Left sidebar and connect them as shown in the image, and connect them to your desired tool.

The way this works is as follows:

  • generate-line-batches calculates line ranges based on the number of lines in the file and the specified batch size.
  • file-splitter passes the line ranges one by one to batch-output.
  • batch-output extracts the specified line range from the file into a batch and passes each batch concurrently to the following nodes.

Tips

  1. The default batch size is 100 lines. If you want to change it, edit the BATCH_SIZE variable at the top of the generate-line-batches script.
  2. All the file-splitter tips apply here as well. Give them a read!