# GoogleNewsDecoder
Decode Google News redirect URLs to their original source URLs.
Google News wraps every article link in an encrypted redirect (`news.google.com/rss/articles/CBMi...`). This library resolves those back to the real URL by extracting decoding parameters from the article page and calling Google's internal `batchexecute` endpoint.
**Zero dependencies** — uses Erlang's built-in `:httpc` for HTTP and Elixir's built-in `JSON` module for encoding/decoding.
This library does **not** bypass rate limiting or CAPTCHAs imposed by Google.
## Installation
Add `google_news_decoder` to your list of dependencies in `mix.exs`:
```elixir
def deps do
  [
    {:google_news_decoder, "~> 0.1.0"}
  ]
end
```
## Quickstart
### Decode a Google News URL
```elixir
{:ok, url} = GoogleNewsDecoder.decode("https://news.google.com/rss/articles/CBMiK2h0dHBz...")
# {:ok, "https://www.reuters.com/world/..."}
```
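Because `decode/1` returns a tagged tuple, callers will usually want to handle the error case rather than match on `{:ok, url}` directly. The sketch below shows one way to do that. `DecodeExample.describe/1` is a hypothetical helper, not part of the library, and the shape of `reason` is not documented here, so it is only passed through `inspect/1`.

```elixir
defmodule DecodeExample do
  # Hypothetical helper (not part of GoogleNewsDecoder): turns the result
  # of decode/1 into a human-readable log line.
  def describe({:ok, url}) when is_binary(url), do: "decoded: " <> url

  # The shape of `reason` is an assumption-free passthrough: whatever the
  # library returns is rendered with inspect/1.
  def describe({:error, reason}), do: "decode failed: " <> inspect(reason)
end

# Usage with the library (requires network access):
#
#   "https://news.google.com/rss/articles/CBMiK2h0dHBz..."
#   |> GoogleNewsDecoder.decode()
#   |> DecodeExample.describe()
```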
### Resolve (decode or pass through)
If you have a mix of Google News and regular URLs, `resolve/1` handles both — it decodes Google News URLs and returns everything else unchanged:
```elixir
GoogleNewsDecoder.resolve("https://news.google.com/rss/articles/CBMiK2h0dHBz...")
# "https://www.reuters.com/world/..."
GoogleNewsDecoder.resolve("https://example.com/article")
# "https://example.com/article"
```
### Check if a URL is a Google News redirect
```elixir
GoogleNewsDecoder.google_news_url?("https://news.google.com/rss/articles/CBMi...")
# true
GoogleNewsDecoder.google_news_url?("https://example.com")
# false
```
### Batch decoding
Decode a list of URLs concurrently with `Task.async_stream/3`:
```elixir
urls
|> Task.async_stream(&GoogleNewsDecoder.resolve/1, max_concurrency: 5, timeout: 15_000)
|> Enum.map(fn {:ok, url} -> url end)
```
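With the default `on_timeout: :exit`, a single slow URL makes the calling process exit. If that is not what you want, one option is to kill the slow task and fall back to the original URL. The following is a minimal sketch under that assumption. `BatchSketch.resolve_all/3` is a hypothetical helper, and the resolver is passed in as an argument so the example does not depend on the library.

```elixir
defmodule BatchSketch do
  # Hypothetical helper: resolves URLs concurrently and falls back to the
  # original URL when a task times out.
  # `resolver` is any 1-arity function, e.g. &GoogleNewsDecoder.resolve/1.
  def resolve_all(urls, resolver, timeout \\ 15_000) do
    urls
    |> Task.async_stream(resolver,
      max_concurrency: 5,
      timeout: timeout,
      # Kill the timed-out task instead of exiting the caller.
      on_timeout: :kill_task
    )
    # Results are emitted in input order (the default), so they can be
    # paired back up with the original URLs.
    |> Enum.zip(urls)
    |> Enum.map(fn
      {{:ok, resolved}, _original} -> resolved
      {{:exit, _reason}, original} -> original
    end)
  end
end

# Usage with the library:
#
#   BatchSketch.resolve_all(urls, &GoogleNewsDecoder.resolve/1)
```

Note that this only covers timeouts. A resolver that raises still crashes the caller, because the tasks are linked to it.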
## API
| Function | Returns | Description |
|---|---|---|
| `decode/1` | `{:ok, url}` or `{:error, reason}` | Decode a Google News URL to its source |
| `resolve/1` | `url` | Decode if Google News, otherwise pass through |
| `google_news_url?/1` | `boolean` | Check if a URL is a Google News redirect |
## How it works
1. Extracts the base64-encoded article ID from the URL path
2. Fetches the Google News article page to obtain a signature (`data-n-a-sg`) and timestamp (`data-n-a-ts`)
3. POSTs those parameters to Google's `batchexecute` endpoint
4. Parses the nested JSON response to extract the original source URL
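Steps 1 and 2 can be illustrated without touching the network. The sketch below is not the library's actual code: the module name, function names, error atoms, and regular expressions are all assumptions made for illustration. It treats the last path segment as the article ID and reads the two attributes from an HTML string that you supply.

```elixir
defmodule HowItWorksSketch do
  # Step 1 (assumed approach): treat the last path segment of a
  # news.google.com URL as the article ID. Query strings such as "?oc=5"
  # are dropped by URI.parse/1.
  def article_id(url) when is_binary(url) do
    with %URI{host: "news.google.com", path: path} when is_binary(path) <- URI.parse(url),
         [_ | _] = segments <- String.split(path, "/", trim: true) do
      {:ok, List.last(segments)}
    else
      _ -> {:error, :not_google_news}
    end
  end

  # Step 2 (assumed approach): pull the signature and timestamp attributes
  # out of the article page HTML with two regular expressions.
  def decoding_params(html) when is_binary(html) do
    with [_, signature] <- Regex.run(~r/data-n-a-sg="([^"]+)"/, html),
         [_, timestamp] <- Regex.run(~r/data-n-a-ts="([^"]+)"/, html) do
      {:ok, %{signature: signature, timestamp: timestamp}}
    else
      _ -> {:error, :params_not_found}
    end
  end
end
```

Steps 3 and 4 depend on the request and response format of Google's internal endpoint, which is undocumented and may change, so they are left out of this sketch.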
## Requirements
- Elixir ~> 1.18 (for the built-in `JSON` module)
- Erlang/OTP 25+ (for `:public_key.cacerts_get/0`)
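The OTP requirement exists because `:httpc` does not verify TLS certificates unless it is given a set of trusted CA certificates, and `:public_key.cacerts_get/0` loads them from the operating system's trust store. As a rough illustration of how the two fit together, here is a minimal sketch. `TlsSketch` and its functions are hypothetical and are not the library's internals.

```elixir
defmodule TlsSketch do
  # Hypothetical helper: TLS options for :httpc that verify the server
  # certificate against the OS trust store (needs OTP 25+).
  def ssl_opts do
    [
      verify: :verify_peer,
      cacerts: :public_key.cacerts_get(),
      customize_hostname_check: [
        match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
      ]
    ]
  end

  # Hypothetical helper: GET a URL and return the status and binary body.
  def get(url) when is_binary(url) do
    # :httpc lives in :inets and needs :ssl for HTTPS.
    {:ok, _} = Application.ensure_all_started(:inets)
    {:ok, _} = Application.ensure_all_started(:ssl)

    # :httpc expects the URL as a charlist.
    request = {String.to_charlist(url), []}

    case :httpc.request(:get, request, [ssl: ssl_opts()], body_format: :binary) do
      {:ok, {{_http_version, status, _reason_phrase}, _headers, body}} ->
        {:ok, status, body}

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

In a Mix project that calls `:httpc` directly, `:inets` and `:ssl` would also need to be listed under `extra_applications` in `mix.exs`.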