


Automating Competitive Intel: From 10-Q to Insights in Minutes

Author(s): dave ginsburg

Originally published on Towards AI.

In the fast-paced world of marketing, gaining a rapid yet rigorous financial snapshot of your competitors can mean the difference between seizing an opportunity and missing the mark. Beyond press releases and news blips, the best source of unfiltered insight often lies in a company’s own SEC filings, specifically the 10-Q quarterly reports. These documents lay out everything from emerging risks to executive commentary on performance and strategy.

Introduction

To eliminate the manual slog of downloading PDFs, hunting through dozens of pages, and scribbling notes, I built a lightweight Python toolkit powered by OpenAI and SEC-API. In a single command, you can:

  1. Specify up to six tickers and automatically pull their latest 10-Q filings (Item 1A: Risk Factors and Item 2: MD&A).
  2. Fetch market data from Yahoo Finance for the past year: EPS, revenue, margins, and share-price movements, all indexed to zero. Note that some of the trendlines fall off traditional quarter boundaries, as some companies have non-standard quarters.
  3. Summarize each section via the ChatGPT API, chunking longer passages to fit within context limits. Note that the focus prompts can be changed to emphasize other areas within Items 1A and 2.
  4. Compile the distilled bullet-point takeaways and four customizable charts into a polished PDF report, ready to share with your team.
Figure: Automating Competitive Intel: From 10-Q to Insights in Minutes (Source: Author)

SEC-API (at $55/month) handles the heavy lifting of SEC extraction, and ChatGPT powers the rapid analysis at a minimal spend of 136K tokens (for the tickers selected) with gpt-4o-mini. The entire process runs in minutes, not hours.

Note that in the scheme of things, the SEC-API cost is not out of line, and it is a much more stable method of pulling reports than many of the alternatives. Future enhancements to the script could automatically detect insider transactions or real-time material events by pulling different filings, but even in its current form, this script transforms quarterly filings from a chore into a strategic advantage.

The following sections describe the script, approaches taken, and other tools leveraged. The complete Python script follows this description, and you’ll need to specify your API keys as well as the base directory for files.

1. Configuration & Setup

  • API Keys & Clients
    We pull the SEC-API and OpenAI keys from environment variables (with sensible defaults hardcoded for development), then instantiate the QueryApi, ExtractorApi, and the OpenAI client.
    Why? Centralizing credentials at the top makes it easy to swap in new keys or switch to a test environment without hunting through code.
  • Work Directory & Ticker List
    All outputs (raw JSON, charts, PDF) land under /home/dave/finance. We accept up to six tickers via the command line, validating input count immediately.
    Why? A single “drop zone” for artifacts keeps things tidy, and CLI flexibility lets you process any subset of companies quickly.
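
As a minimal sketch of the setup described above (not the script's actual code), the configuration and ticker validation could look like the following. The environment variable names `SEC_API_KEY`, `OPENAI_API_KEY`, and `FINANCE_WORKDIR`, and the helper names `load_settings` and `parse_tickers`, are assumptions made for illustration.

```python
import os
from pathlib import Path

MAX_TICKERS = 6  # the article caps a run at six tickers


def load_settings(env=os.environ):
    """Read API keys and the work directory from the environment.

    The variable names here are illustrative assumptions, not necessarily
    the ones the published script uses.
    """
    return {
        "sec_api_key": env.get("SEC_API_KEY", ""),
        "openai_api_key": env.get("OPENAI_API_KEY", ""),
        "work_dir": Path(env.get("FINANCE_WORKDIR", "/home/dave/finance")),
    }


def parse_tickers(argv):
    """Validate the CLI arguments: between one and MAX_TICKERS symbols."""
    tickers = [arg.strip().upper() for arg in argv if arg.strip()]
    if not 1 <= len(tickers) <= MAX_TICKERS:
        raise ValueError(f"expected 1 to {MAX_TICKERS} tickers, got {len(tickers)}")
    return tickers
```

In a script this would typically be called as `parse_tickers(sys.argv[1:])`, so a bad ticker count fails before any API call is made.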

2. SEC 10-Q Fetch & Extraction

for each ticker:
    • Query the latest 10-Q filing
    • Pull two key sections:
        – Item 1A: Risk Factors
        – Item 2: Management Discussion & Analysis (MD&A)
    • Save as `<ticker>_10q.json`
  • Why we use SEC-API instead of HTML scraping:
    It returns clean JSON blobs for exactly the sections we need β€” no fragile XPath or PDF parsing.
  • Error handling per section ensures that if one extraction fails, the script continues for the others.
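
The per-section error handling can be sketched as below. This is an illustration under stated assumptions, not the author's code: the section codes `part2item1a` and `part1item2`, the query payload shape, and the helper names are assumptions that should be checked against the SEC-API documentation. The `extractor` argument is any object with a `get_section(url, section, return_type)` method, such as an `ExtractorApi` instance.

```python
import json
from pathlib import Path

# Assumed sec-api section codes for a 10-Q; verify against the SEC-API docs.
SECTIONS = {"risk_factors": "part2item1a", "mdna": "part1item2"}


def latest_10q_query(ticker):
    """Query payload asking for the single most recent 10-Q of one ticker."""
    return {
        "query": f'ticker:{ticker} AND formType:"10-Q"',
        "from": "0",
        "size": "1",
        "sort": [{"filedAt": {"order": "desc"}}],
    }


def extract_sections(extractor, filing_url):
    """Pull each section independently so one failure does not stop the rest."""
    result = {}
    for name, code in SECTIONS.items():
        try:
            result[name] = extractor.get_section(filing_url, code, "text")
        except Exception as exc:  # keep going; record the failure instead
            print(f"{name}: extraction failed ({exc})")
            result[name] = ""
    return result


def save_sections(ticker, sections, work_dir):
    """Write the extracted text to <ticker>_10q.json under work_dir."""
    path = Path(work_dir) / f"{ticker}_10q.json"
    path.write_text(json.dumps(sections, indent=2), encoding="utf-8")
    return path
```

With the real clients, the payload from `latest_10q_query` would be passed to `QueryApi.get_filings`, and the filing URL taken from the first returned filing (assumed here to be under a key such as `linkToFilingDetails`).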

3. Financial Data & Chart Generation

for each ticker:
    • Download 12 months of daily closing prices via yfinance
    • Pull quarterly EPS, Revenue, Gross Profit
    • Compute quarterly gross margin % and normalize price to a 0-baseline
    • Render four Matplotlib line charts
        – EPS
        – Revenue
        – Gross Margin %
        – Indexed Share-Price Change
  • Why Matplotlib + simple loops?
    It’s lightweight, requires no fancy styling, and is fully scriptable in a headless server environment.
  • Why normalize & quarterly‐aggregate?
    Quarterly snapshots smooth out daily noise and give a consistent cadence for comparison across companies.
Example chart from analysis
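
The two derived series can be illustrated with plain Python. This sketch assumes that "normalize price to a 0-baseline" means percentage change relative to the first closing price in the window; the function names are hypothetical and the data fetching and plotting steps are left out.

```python
def gross_margin_pct(gross_profit, revenue):
    """Quarterly gross margin as a percentage; None when revenue is zero or missing."""
    if not revenue:
        return None
    return 100.0 * gross_profit / revenue


def index_to_zero(prices):
    """Percentage change of each price relative to the first one.

    Every series starts at 0, which makes companies with very different
    share prices comparable on one chart. Assumes the first price is non-zero.
    """
    if not prices:
        return []
    base = prices[0]
    return [100.0 * (p - base) / base for p in prices]
```

The resulting lists can be passed straight to a Matplotlib `plot` call, one line per ticker.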

4. Chunked GPT Summarization

Summarize section(text, section, ticker):
1. Split text into ≤15,000-char chunks
2. For each chunk:
    – Build a precise GPT prompt:
        • “Extract the most important … Respond with at most 8 concise bullet points …”
    – Call gpt-4o-mini
    – Regex-extract the JSON array out of the raw reply
    – Append and dedupe
3. Return the top 8 unique bullets
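
The non-API parts of these steps (splitting, pulling the JSON array out of a reply, and deduplicating) can be sketched as follows. The helper names are hypothetical and the model call itself is omitted; this is an illustration of the technique, not the script's code.

```python
import json
import re

CHUNK_CHARS = 15_000  # chunk size quoted in the article
MAX_BULLETS = 8


def split_chunks(text, size=CHUNK_CHARS):
    """Slice text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def extract_bullets(raw_reply):
    """Isolate the [...] span in a model reply and parse it as JSON."""
    match = re.search(r"\[.*\]", raw_reply, re.DOTALL)
    if not match:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    # Coerce everything to str so dicts or numbers cannot break deduplication.
    return [str(item).strip() for item in items]


def merge_bullets(per_chunk, limit=MAX_BULLETS):
    """Keep the first `limit` unique bullets, preserving their order."""
    seen, merged = set(), []
    for bullets in per_chunk:
        for bullet in bullets:
            if bullet and bullet not in seen:
                seen.add(bullet)
                merged.append(bullet)
    return merged[:limit]
```

Splitting on a fixed character count can cut a sentence in half at a chunk boundary; splitting on paragraph breaks would be a possible refinement.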
  • Chunking
    Long SEC sections (up to ~130,000 characters) would blow past GPT’s ~16,000-token window. By slicing into 15,000-character pieces, each API call stays safely within limits.
  • JSON Extraction with Regex
    Even with explicit “output only JSON” instructions, models sometimes prepend or append stray text. A simple re.search with the pattern r"\[.*\]" guards against parse errors by isolating the JSON array.
  • Deduplication & Limiting
    We gather bullets from every chunk, convert them explicitly to strings (avoiding unhashable dicts), then keep only the first 8 unique points. This yields a tight, non-redundant summary.
  • Exponential Backoff with Jitter
    Rate limits and transient errors are handled by retrying up to six times per chunk, waiting 30 → 60 → 120 s between tries, plus a bit of random “jitter” to avoid thundering-herd API calls.
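
A generic retry wrapper in the spirit of the backoff described above might look like this. It is a sketch under assumptions: six tries in total, a 30-second base delay that doubles each time, and up to 5 seconds of jitter (the article does not state the jitter range). The name `call_with_backoff` is hypothetical.

```python
import random
import time


def call_with_backoff(func, max_tries=6, base_delay=30.0, max_jitter=5.0,
                      sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_tries):
        try:
            return func()
        except Exception:
            if attempt == max_tries - 1:
                raise  # out of retries: surface the last error
            # 30 s, 60 s, 120 s, ... plus a random offset so that parallel
            # workers do not all retry at the same instant.
            delay = base_delay * (2 ** attempt) + random.uniform(0.0, max_jitter)
            sleep(delay)
```

Catching every `Exception` keeps the sketch short; in practice it is safer to retry only on rate-limit and transient network errors and let everything else fail immediately. The `sleep` parameter exists so the wait can be replaced in tests.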

5. PDF Assembly

doc = SimpleDocTemplate('Full_Report.pdf')
for each ticker:
    • Heading: ticker symbol
    • Subheading: Key Risks
        – Bullet list from JSON summary
    • Subheading: Management Discussion & Analysis
        – Bullet list from JSON summary
    • Page break
• Final section: “Financial Performance Overview”
    – Embed the four charts (EPS, Revenue, Margin, Price)
doc.build(elements)
  • Why ReportLab?
    It’s a pure-Python library with straightforward flow-document abstractions (paragraphs, images, page breaks) and no external dependencies.
  • Why PDF?
    A single, self-contained deliverable that can be emailed, archived, or printed without worrying about missing images or mixed file types.
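
The assembly outline above could be realized roughly as follows. This is a hedged sketch, not the author's script: the `risks` and `mdna` dictionary keys, the function names, and the image dimensions are assumptions. The ReportLab import is deferred so the outline logic can be exercised without ReportLab installed.

```python
from xml.sax.saxutils import escape


def outline_report(summaries, chart_paths):
    """Flatten per-ticker summaries into an ordered list of (kind, text) items."""
    items = []
    for ticker, sections in summaries.items():
        items.append(("h1", ticker))
        for title, key in (("Key Risks", "risks"),
                           ("Management Discussion & Analysis", "mdna")):
            items.append(("h2", title))
            items.extend(("bullet", b) for b in sections.get(key, []))
        items.append(("break", ""))
    items.append(("h1", "Financial Performance Overview"))
    items.extend(("image", p) for p in chart_paths)
    return items


def render_pdf(items, pdf_path="Full_Report.pdf"):
    """Turn the outline into ReportLab flowables and write the PDF."""
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Image, PageBreak, Paragraph, SimpleDocTemplate

    styles = getSampleStyleSheet()
    elements = []
    for kind, text in items:
        if kind == "h1":
            elements.append(Paragraph(escape(text), styles["Heading1"]))
        elif kind == "h2":
            elements.append(Paragraph(escape(text), styles["Heading2"]))
        elif kind == "bullet":
            # Paragraph parses XML-like markup, so "&" and "<" must be escaped.
            elements.append(Paragraph("\u2022 " + escape(text), styles["BodyText"]))
        elif kind == "break":
            elements.append(PageBreak())
        elif kind == "image":
            elements.append(Image(text, width=450, height=250))
    SimpleDocTemplate(pdf_path).build(elements)
```

Escaping matters here because headings such as "Management Discussion & Analysis" and model-generated bullets can contain characters that `Paragraph` would otherwise try to interpret as markup.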

6. Putting It All Together

  1. Data Gathering: raw SEC text + financial time series
  2. Analysis & Summarization: automated, bullet-point distillation of prose
  3. Visualization: clear charts to compare key metrics
  4. Reporting: one PDF combining narrative and graphics

This pipeline ensures repeatability, scalability (add more tickers as needed), and robustness (rate‐limit/backoff, chunking, JSON‐safeguards). It can be integrated into nightly jobs, reporting dashboards, or be adapted to other SEC filings (e.g. 10-Ks, 8-Ks) with minimal changes.

From PDF summary, Nvidia Management Discussion & Analysis

7. Python Script

Available at: https://github.com/daveginsburg/financial_reporting.git or https://gist.github.com/daveginsburg/6cac662c83901d167cbac63e8f8bc410


Published via Towards AI
