Interactive Jupyter notebooks demonstrating parallel-web-tools features.
Demonstrates Apache Spark integration with SQL-native UDFs:
- Register parallel_enrich() UDF with Spark
- Enrich data using SQL queries
- Parse JSON results into structured columns
- Batch enrichment with Python API
- Streaming enrichment with foreachBatch
- Processor selection guide
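The JSON-parsing step in the list above can be sketched in plain Python. This is only an illustration under stated assumptions: the payload shape and the field names (ceo_name, founding_year) are hypothetical, and parse_result is a helper invented for this sketch, not part of parallel-web-tools. Inside Spark, the same step is typically done with built-in functions such as from_json or get_json_object; the notebook itself may take a different approach.

```python
import json

# Hypothetical enrichment result; the field names are made up for illustration
# and are not taken from the parallel-web-tools documentation.
raw_result = '{"ceo_name": "Jane Doe", "founding_year": 1999}'


def parse_result(raw, fields):
    """Return one value per requested field.

    Missing fields, non-object payloads, and invalid JSON all yield None
    values, so a bad row does not abort the whole batch.
    """
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # TypeError: raw is None; ValueError: raw is not valid JSON.
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {name: data.get(name) for name in fields}


row = parse_result(raw_result, ["ceo_name", "founding_year"])
```

Returning None for anything that cannot be parsed mirrors how Spark's JSON functions produce nulls for malformed input, which keeps one failed enrichment from failing the job.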
Prerequisites:
pip install parallel-web-tools[spark] jupyter
- Install dependencies:
  pip install parallel-web-tools[spark] jupyter
- Authenticate:
  parallel-cli login
- Start Jupyter:
  jupyter notebook
- Open the desired notebook and run the cells.
- Notebooks require Java Runtime Environment for Spark
- API calls may take a few seconds per row depending on processor
- Use the lite-fast processor for quick demos, pro for deeper research
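A small preflight check can catch the two most common setup issues from the notes above before a notebook is opened. The sketch below uses only the Python standard library. The processor names come from the notes, while the goal labels (quick_demo, deep_research) and both function names are hypothetical, chosen for this example.

```python
import shutil

# Processor names come from the notes above; the goal labels used as keys
# are hypothetical and exist only for this sketch.
PROCESSOR_BY_GOAL = {
    "quick_demo": "lite-fast",
    "deep_research": "pro",
}


def java_available() -> bool:
    """Return True if a `java` executable is on PATH (Spark needs a JRE)."""
    return shutil.which("java") is not None


def pick_processor(goal: str) -> str:
    """Map a goal label to a processor name, defaulting to the quick option."""
    return PROCESSOR_BY_GOAL.get(goal, "lite-fast")


if not java_available():
    print("No Java runtime found on PATH; the Spark notebooks will not start.")
print("Suggested processor for a quick demo:", pick_processor("quick_demo"))
```

Defaulting to the faster processor is a deliberate choice here, since per-row API latency adds up quickly when experimenting on more than a handful of rows.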