How I Made a Telegram Channel Archive

This article was generated by ChatGPT and slightly edited by a human being. It serves as a rough tutorial and reference for this idea.

Example:

https://github.com/cppuix/personalEnglishTgChannel

Result:

https://cppuix.github.io/personalEnglishTgChannel/

Difficulty Level

Overall difficulty: Moderate

This project does require some technical familiarity, but it does not require advanced programming skills. Most of the work involves following structured steps rather than inventing new solutions.

You will be more comfortable with this setup if you are familiar with:

Using a terminal (running commands)
Handling files and folders
Basic Git usage (add, commit, push)
Basic web concepts (HTML, CSS, JavaScript)

If you are completely new to technical tools, expect some trial and error. Most difficulties come from small environment issues rather than from the archive logic itself.

Using modern LLM tools can significantly reduce the difficulty, especially when building the static website interface.

We recommend pasting this article into ChatGPT, Claude, Gemini, or DeepSeek and asking for a step-by-step guide, including how to troubleshoot any errors you run into.

However, some understanding is still helpful when troubleshooting problems.

In practical terms:

Beginner level: Challenging, but possible with patience
Intermediate level: Very manageable
Experienced users: Straightforward

Features

Searchable posts
Date and month filtering
Media filters
Arabic and English support
Images, videos, audio, PDFs, and files
Dark and light mode
Message links
Copy post text
Media hosted through Archive.org

This guide explains how to turn a Telegram channel export into a static archive website.

The setup has three main parts:

Export the Telegram channel data.
Upload media files to Archive.org, because they are too large to keep in the Git repository.
Host the archive website as a static site.

1. Export the Telegram channel

From Telegram Desktop:

Open the channel.
Click the menu.
Choose Export chat history.
Export as JSON.
Include media if you want images, audio, video, and files to work in the archive.

The export folder may contain files like:

result.json
photos/
video_files/
files/
stickers/
voice_messages/
images/

The important file is:

result.json

That is the raw Telegram export.
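Before going further, it can help to check what the export actually contains. The following is a minimal Python sketch, not part of the linked repo. It assumes the export has a top-level "messages" list and that media paths sit under keys such as "photo", "file", and "thumbnail"; those key names and the summarize_export helper are assumptions, so adjust them to match your own result.json.

```python
import json
from collections import Counter
from pathlib import Path

# Keys assumed to hold a relative path to a media file.
# This list is an assumption; check it against your own export.
MEDIA_KEYS = ("photo", "file", "thumbnail")


def summarize_export(data: dict) -> dict:
    """Count messages and media references in a parsed Telegram export."""
    messages = data.get("messages", [])
    by_type = Counter(m.get("type", "unknown") for m in messages)
    media_refs = [
        m[key]
        for m in messages
        for key in MEDIA_KEYS
        if isinstance(m.get(key), str)
    ]
    return {
        "total": len(messages),
        "by_type": dict(by_type),
        "media_refs": len(media_refs),
    }


if __name__ == "__main__":
    export_path = Path("result.json")
    if export_path.exists():
        data = json.loads(export_path.read_text(encoding="utf-8"))
        print(summarize_export(data))
    else:
        print("result.json not found in the current folder")
```

If the media count is zero, the export was probably made without media, and the later upload steps will have nothing to work with.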

2. Upload media to Archive.org

The website itself should stay light, so media files are uploaded to Archive.org instead of being stored inside the Git repository.

Install the Internet Archive CLI (these commands are for Fedora Linux; look up the equivalent for your OS):

sudo dnf install pipx
pipx ensurepath
pipx install internetarchive

Configure it:

ia configure

Upload the media folders:

ia upload YOUR-ARCHIVE-ITEM-NAME \
  photos/ \
  video_files/ \
  files/ \
  stickers/ \
  voice_messages/ \
  images/

Replace:

YOUR-ARCHIVE-ITEM-NAME

with the Archive.org item name you created.

If the upload stops or Archive.org rate-limits you, upload the missing files one at a time, with a pause between uploads.

First create a local list:

find photos video_files files stickers voice_messages images \
  -type f -printf '%f\n' | sort > local-flat.txt

Then create an Archive.org list:

ia list YOUR-ARCHIVE-ITEM-NAME | sort > uploaded.txt

Compare them:

comm -23 local-flat.txt uploaded.txt > missing-flat.txt

Upload the missing files slowly:

while IFS= read -r name; do
  file=$(find photos video_files files stickers voice_messages images -type f -name "$name" | head -1)
  if [ -n "$file" ]; then
    echo "Uploading: $file"
    ia upload YOUR-ARCHIVE-ITEM-NAME "$file"
    sleep 12
  else
    echo "Could not find: $name"
  fi
done < missing-flat.txt
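The same comparison can be done in Python if the shell tools are not available (for example, find -printf is specific to GNU find). This is a rough sketch under the same assumptions as the commands above: file names are compared without their folder, and uploaded.txt holds one uploaded file name per line. The helper names local_media and missing_uploads are made up for this example.

```python
from pathlib import Path

MEDIA_DIRS = ["photos", "video_files", "files", "stickers", "voice_messages", "images"]


def local_media(root: Path, media_dirs=MEDIA_DIRS) -> dict:
    """Map each flat file name to its local path (first match wins)."""
    found = {}
    for folder in media_dirs:
        base = root / folder
        if not base.is_dir():
            continue  # not every export contains every folder
        for path in sorted(base.rglob("*")):
            if path.is_file():
                found.setdefault(path.name, path)
    return found


def missing_uploads(local: dict, uploaded_names) -> list:
    """Return local paths whose file name is absent from the uploaded list."""
    uploaded = {name.strip() for name in uploaded_names if name.strip()}
    return [path for name, path in sorted(local.items()) if name not in uploaded]


if __name__ == "__main__":
    uploaded_file = Path("uploaded.txt")
    if uploaded_file.exists():
        lines = uploaded_file.read_text(encoding="utf-8").splitlines()
        for path in missing_uploads(local_media(Path(".")), lines):
            print(path)
```

The printed paths can then be passed to ia upload one at a time, with a pause between them, as in the shell loop.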

3. Process the Telegram JSON

The raw Telegram JSON is not ideal for the frontend.

So it is processed into a cleaner file:

result.processed.json

You will need this file: https://github.com/cppuix/personalEnglishTgChannel/blob/master/tools/preprocess.py

Make sure you follow the project folder structure I have in the linked repo.

Run:

python3 tools/preprocess.py result.json result.processed.json \
  --media-base https://archive.org/download/YOUR-ARCHIVE-ITEM-NAME/ \
  --flatten-media

This does several things:

Cleans the message data.
Converts text into paragraphs.
Handles Arabic and English paragraph direction.
Builds Archive.org media URLs.
Adds searchable text.
Adds tags for filtering.
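To give an idea of what the paragraph and direction steps involve, here is a minimal sketch. It is not the code from preprocess.py; it is one possible approach that assumes direction can be decided by the first strongly directional character in each paragraph. The function names and the "text" and "dir" keys are hypothetical.

```python
import unicodedata


def paragraph_direction(text: str) -> str:
    """Return "rtl" or "ltr" based on the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # Hebrew-style and Arabic letters
            return "rtl"
        if bidi == "L":  # Latin and other left-to-right letters
            return "ltr"
    return "ltr"  # digits, punctuation, or empty text: default to LTR


def split_paragraphs(text: str) -> list:
    """Split text on line breaks and tag each non-empty paragraph with a direction."""
    paragraphs = []
    for block in text.split("\n"):
        block = block.strip()
        if block:
            paragraphs.append({"text": block, "dir": paragraph_direction(block)})
    return paragraphs
```

Deciding direction per paragraph rather than per message lets a post that mixes Arabic and English lines render each line on the correct side.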

The --flatten-media option is important because Archive.org may store uploaded files without their original folder paths.

Example:

photos/photo_1.jpg

becomes:

https://archive.org/download/YOUR-ARCHIVE-ITEM-NAME/photo_1.jpg
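The mapping above can be sketched in a few lines of Python. This illustrates the idea only and is not necessarily how preprocess.py implements it; the function name flatten_media_url is made up for this example.

```python
from posixpath import basename
from urllib.parse import quote


def flatten_media_url(media_base: str, relative_path: str) -> str:
    """Drop the folder part of an export path and append the file name to the base URL."""
    name = basename(relative_path.replace("\\", "/"))
    # quote() escapes spaces and other characters that are unsafe in a URL path
    return media_base.rstrip("/") + "/" + quote(name)


if __name__ == "__main__":
    base = "https://archive.org/download/YOUR-ARCHIVE-ITEM-NAME/"
    print(flatten_media_url(base, "photos/photo_1.jpg"))
```

One consequence of flattening is that two files with the same name in different folders would end up pointing at the same URL, so it is worth checking for duplicate file names before uploading.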

3.5 Build the Static Website

The archive website itself is just a static site made with:

HTML (structure)
CSS (design)
JavaScript (logic)

Its only job is to:

Load result.processed.json
Display messages
Allow searching and filtering
Show media using URLs
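The search and filter logic is simple enough to sketch. The real viewer runs in the browser as JavaScript; the version below is in Python only to stay consistent with the preprocessing script. The field names "search_text" and "date", and the assumption that dates are ISO strings starting with YYYY-MM, are guesses for illustration and may not match the processed file.

```python
def filter_messages(messages, query="", month=""):
    """Keep messages that contain the query text and fall in the given month.

    query: case-insensitive substring to look for (empty means no text filter)
    month: "YYYY-MM" prefix of the ISO date (empty means no date filter)
    """
    needle = query.casefold().strip()
    result = []
    for msg in messages:
        if needle and needle not in msg.get("search_text", "").casefold():
            continue
        if month and not msg.get("date", "").startswith(month):
            continue
        result.append(msg)
    return result


if __name__ == "__main__":
    sample = [
        {"id": 1, "date": "2024-01-05T10:00:00", "search_text": "Irregular verbs list"},
        {"id": 2, "date": "2024-02-11T09:30:00", "search_text": "Phrasal verbs"},
    ]
    print([m["id"] for m in filter_messages(sample, query="verbs", month="2024-02")])
```

Because everything is filtered in memory from a single JSON file, no backend or database is needed.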

You do not need advanced web development skills to build this.

A practical approach is to use an LLM to generate the initial site structure, then refine it step by step.

Using an LLM to Generate the Website

You can ask an LLM to create a static viewer that reads the Telegram export JSON and displays it.

Start with a clear prompt describing what the site must do.

Example prompt:

Create a static website using only HTML, CSS, and vanilla JavaScript.

The website should:

- Load a file named result.processed.json
- Display messages in chronological order
- Show message text with proper paragraph formatting
- Display media (images, video, audio, files) using URLs inside the JSON
- Support searching messages by text
- Support filtering by date
- Handle both Arabic (RTL) and English (LTR) text correctly
- Use a clean, readable layout
- Work fully offline except for media URLs
- Not require any frameworks or backend

Return:

- index.html
- styles.css
- app.js

Keep the code simple and readable.

After generating the first version:

Test it locally
Adjust features gradually
Ask for improvements as needed

Example refinement prompts:

Add dark and light mode support.

Add month-based filtering.

Improve performance when loading large JSON files.

Add a copy-to-clipboard button for message text.

Summary

The static site is simply a viewer that reads JSON and displays messages.

You can generate the first version using an LLM, then refine it step by step.

Most of the complexity comes from customization, not from the basic structure.

4. Run the archive locally

Start a local server:

python3 -m http.server 8080

Open:

http://localhost:8080

Do not open index.html directly by double-clicking it. Browsers generally block JavaScript from fetching local files over file:// URLs, so the JSON file would fail to load.

5. Deploy the archive

The archive is static, so it can be hosted on:

GitHub Pages
Vercel
Netlify
Any static host

The main files are:

index.html
styles.css
app.js
result.processed.json
assets/
fonts/

Commit and push:

git add .
git commit -m "Build Telegram archive"
git push

6. Updating the archive later

When you want to update the archive:

Export the channel again.
Replace result.json.
Upload any new media files to Archive.org.
Regenerate result.processed.json.
Test locally.
Commit and push.

Regenerate:

python3 tools/preprocess.py result.json result.processed.json \
  --media-base https://archive.org/download/YOUR-ARCHIVE-ITEM-NAME/ \
  --flatten-media

Test:

python3 -m http.server 8080

Commit:

git add result.json result.processed.json
git commit -m "Update archive data"
git push

Folder structure

A typical project folder looks like this:

index.html
styles.css
app.js
result.json
result.processed.json
assets/
  logo.jpg
fonts/
  ExampleFont-Regular.ttf
tools/
  preprocess.py
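A quick check before committing can catch a missing file early. This is an optional sketch; the REQUIRED list is taken from the structure above, and the helper name missing_project_files is made up for this example.

```python
from pathlib import Path

# Files the archive needs in order to load, based on the structure above.
REQUIRED = [
    "index.html",
    "styles.css",
    "app.js",
    "result.processed.json",
    "tools/preprocess.py",
]


def missing_project_files(root, required=REQUIRED) -> list:
    """Return the required files that do not exist under the project root."""
    root = Path(root)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_project_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present")
```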

Summary

The archive works by turning a Telegram JSON export into a static website, while storing large media files on Archive.org.

Telegram provides the exported content.
Archive.org stores the media.
The static site presents everything in a searchable archive.

Digitization