Methodology
This page explains how Stock Market Words builds its dataset, extracts ticker symbols from text, and scores tickers across the site’s five strategy pages.
Last updated: March 13, 2026
The short version
The site combines a text-parsing engine with a daily market-data pipeline. The pipeline runs in five steps:
- Build a universe of valid stock symbols.
- Filter out instruments that are not useful for this project.
- Enrich the remaining symbols with price, liquidity, and technical data.
- Score symbols against five educational strategy lenses.
- Render the results into Hugo pages, datasets, and ticker detail pages.
Data sources
The project uses market data gathered by the pipeline that powers this repository. The site distinguishes between:
- Raw FTP data: the broad symbol universe before project filtering
- Filtered tickers: the stock universe kept by this project
- Strategy datasets: subsets that match specific screening logic
- Ticker detail pages: per-symbol summaries built from the merged data file
You can inspect the raw and filtered views at:
Ticker extraction logic
The homepage tool does not simply split text on spaces and hope for the best. It uses a custom symbol matching approach designed to find valid ticker strings embedded inside normal text.
At a high level, the extractor:
- normalizes the input text
- walks the text with a prefix-aware search structure
- tries non-overlapping symbol matches
- backtracks when a local match blocks a better global solution
This matters because stock symbols are short and ambiguous. Many are also common English fragments. A naive approach would generate too many false positives.
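The steps above can be illustrated with a small sketch. It is not the site's actual extractor. It assumes that a "better global solution" means covering the most characters with non-overlapping symbols. It also swaps explicit backtracking for a right-to-left dynamic program over a trie, which reaches the same answer without re-exploring suffixes. The function names and the tiny symbol set are hypothetical.

```python
def build_trie(symbols):
    """Build a nested-dict trie; the "$" key marks the end of a symbol."""
    root = {}
    for sym in symbols:
        node = root
        for ch in sym:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def normalize(text):
    """Uppercase the text and keep only letters (an assumed normalization)."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def extract_tickers(text, symbols):
    """Return non-overlapping symbols covering as many characters as possible."""
    s = normalize(text)
    trie = build_trie(symbols)
    n = len(s)
    # best[i] holds (covered_chars, matches) for the suffix s[i:].
    best = [(0, ())] * (n + 1)
    for i in range(n - 1, -1, -1):
        result = best[i + 1]  # option 1: skip the character at position i
        node, j = trie, i
        # Option 2: follow the trie to try every symbol that starts at i.
        while j < n and s[j] in node:
            node = node[s[j]]
            j += 1
            if "$" in node:
                covered, rest = best[j]
                if covered + (j - i) > result[0]:
                    result = (covered + (j - i), (s[i:j],) + rest)
        best[i] = result
    return list(best[0][1])


# A greedy longest-match would take "AB" and then fail on "CD".
print(extract_tickers("ab cd", {"A", "AB", "BCD"}))  # ['A', 'BCD']
```

In the example, choosing the shorter local match "A" frees up "BCD", which is the kind of case where a purely greedy scan gives a worse result.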
For a full walkthrough, read How Ticker Extraction Works.
Filtering rules
The site does not keep every symbol that appears in the raw source data. The pipeline filters out many items that would reduce quality or make the research pages harder to use, including instruments that are outside the project’s target scope.
The exact pipeline evolves, but the practical goal is consistent:
- focus on tradable, recognizable equity symbols
- avoid clutter from low-quality or non-target instruments
- keep enough liquidity and data coverage for useful comparison
Base inclusion thresholds
Many site views rely on a practical minimum quality bar before a symbol is shown prominently:
- Price >= $5
- Volume >= 100,000
Those thresholds help remove many illiquid, low-information tickers from strategy views and ticker detail generation.
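As a rough illustration, the two thresholds can be expressed as a single predicate. The field names (`symbol`, `price`, `volume`) and the choice to reject rows with missing values are assumptions made for this sketch, not details confirmed by the pipeline.

```python
MIN_PRICE = 5.0        # dollars, from the threshold list above
MIN_VOLUME = 100_000   # from the threshold list above


def passes_base_filters(row):
    """Return True when a row meets both the price and volume minimums."""
    price = row.get("price")
    volume = row.get("volume")
    # Assumption: a row with missing data is treated as failing the filter.
    if price is None or volume is None:
        return False
    return price >= MIN_PRICE and volume >= MIN_VOLUME


# Hypothetical sample rows, not real market data.
rows = [
    {"symbol": "AAA", "price": 12.5, "volume": 250_000},
    {"symbol": "BBB", "price": 3.2, "volume": 900_000},
    {"symbol": "CCC", "price": 48.0, "volume": 40_000},
    {"symbol": "DDD", "price": 20.0, "volume": None},
]
kept = [r["symbol"] for r in rows if passes_base_filters(r)]
print(kept)  # ['AAA']
```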
Strategy scores
The site uses five educational strategies. These are not claims of future performance. They are ways to rank the current dataset using different preferences.
Dividend Daddy
Looks for dividend-paying stocks and rewards:
- higher dividend yield
- lower beta / volatility
Moon Shot
Looks for aggressive growth candidates and rewards:
- higher beta
- RSI that is not already overbought
Falling Knife
Looks for deep pullbacks and rewards:
- low RSI
- price below the 200-day moving average
Over-Hyped
Looks for stretched momentum and rewards:
- very high RSI
Institutional Whale
Looks for large, liquid companies and rewards:
- larger market capitalization
Read the editorial guide How to Read the Five Strategies before treating any strategy table as a decision tool.
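To make the idea of "rewarding" inputs concrete, here is a hedged sketch of how a Dividend Daddy style score could be computed. The site's real formula is not documented on this page. The percentile ranking, the equal weighting, the 0 to 100 scale, and the field names are all assumptions chosen for illustration.

```python
def rank_pct(values):
    """Map each value to a rank in [0, 1]; tied values share the lowest rank."""
    ordered = sorted(values)
    denom = max(len(values) - 1, 1)
    return [ordered.index(v) / denom for v in values]


def dividend_daddy_scores(rows):
    """Hypothetical score: average of yield rank and inverted beta rank."""
    # Only dividend-paying symbols are scored in this sketch.
    payers = [r for r in rows if r["dividend_yield"] > 0]
    yield_rank = rank_pct([r["dividend_yield"] for r in payers])
    beta_rank = rank_pct([r["beta"] for r in payers])
    return {
        r["symbol"]: round(100 * (y + (1 - b)) / 2, 1)
        for r, y, b in zip(payers, yield_rank, beta_rank)
    }


# Hypothetical sample rows, not real market data.
sample = [
    {"symbol": "AAA", "dividend_yield": 4.0, "beta": 0.6},
    {"symbol": "BBB", "dividend_yield": 2.0, "beta": 1.2},
    {"symbol": "CCC", "dividend_yield": 0.0, "beta": 0.9},
]
print(dividend_daddy_scores(sample))  # {'AAA': 100.0, 'BBB': 0.0}
```

Because the ranks are relative to the current dataset, a score like this only orders today's symbols against each other. It says nothing about future performance.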
Why some tickers have no scores
Not every ticker page has a full strategy score block. That usually means one of three things:
- the symbol did not pass the base filters used for the strategy datasets
- the symbol had incomplete supporting data for the scoring step
- the symbol simply did not match any strategy criteria on the latest run
This is expected. Absence of a score is not a judgment that a company is bad; it usually means the symbol missed a base filter, lacked supporting data, or matched no strategy criteria on the latest run.
Freshness and update cadence
The market data behind the site is refreshed by the project pipeline and then written into Hugo data files used by the site build. When you see a ticker page, strategy page, or filtered dataset, you are seeing a static view of the latest successfully generated data at build time.
Limitations
This site has important limits:
- it is a research and education tool, not a brokerage platform
- it simplifies complex investing ideas into compact screens
- it depends on upstream data quality and pipeline freshness
- it does not replace reading filings, earnings reports, or company guidance
This site is for educational and informational purposes only and does not provide investment advice.
