How I Made a Telegram Channel Archive
Difficulty Level
Overall difficulty: Moderate
This project does require some technical familiarity, but it does not require advanced programming skills. Most of the work involves following structured steps rather than inventing new solutions.
You will be more comfortable with this setup if you are familiar with:
- Using a terminal (running commands)
- Handling files and folders
- Basic Git usage (add, commit, push)
- Basic web concepts (HTML, CSS, JavaScript)
If you are completely new to technical tools, expect some trial and error. Most difficulties usually come from small environment issues rather than the archive logic itself.
Using modern LLM tools can significantly reduce the difficulty, especially when building the static website interface.
We recommend copying and pasting this article into ChatGPT, Claude, Gemini, or DeepSeek, and asking for a step-by-step guide and for help troubleshooting any errors.
However, some understanding is still helpful when troubleshooting problems.
In practical terms:
- Beginner level: Challenging, but possible with patience
- Intermediate level: Very manageable
- Experienced users: Straightforward
Features
- Searchable posts
- Date and month filtering
- Media filters
- Arabic and English support
- Images, videos, audio, PDFs, and files
- Dark and light mode
- Message links
- Copy post text
- Media hosted through Archive.org
The setup has three main parts:
- Export the Telegram channel data.
- Upload media files to Archive.org, because they are too large to keep in the Git repository.
- Host the archive website as a static site.
1. Export the Telegram channel
From Telegram Desktop:
- Open the channel.
- Click the menu.
- Choose Export chat history.
- Export as JSON.
- Include media if you want images, audio, video, and files to work in the archive.
The export folder may contain files like:
result.json
photos/
video_files/
files/
stickers/
voice_messages/
images/
The important file is:
result.json
That is the raw Telegram export.
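Before going further, it can help to check what the export actually contains. The following is a minimal Python sketch, not part of the linked repository. It assumes the usual Telegram Desktop layout, where result.json has a top-level "messages" list and media messages carry a "photo" or "file" key (with an optional "media_type"). The function names are made up for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def load_export(path: str) -> dict:
    """Read a Telegram export file (UTF-8 JSON) into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_export(data: dict) -> dict:
    """Count messages by kind and report the covered date range."""
    messages = data.get("messages", [])
    kinds = Counter()
    for msg in messages:
        if "photo" in msg:
            kinds["photo"] += 1
        elif "file" in msg:
            # Assumption: "media_type" is set for voice, video, stickers, etc.
            # and missing for plain file attachments.
            kinds[msg.get("media_type", "file")] += 1
        else:
            kinds["text_only"] += 1
    # Dates are assumed to be ISO strings, so string order equals time order.
    dates = [m["date"] for m in messages if "date" in m]
    return {
        "total": len(messages),
        "by_kind": dict(kinds),
        "first_date": min(dates) if dates else None,
        "last_date": max(dates) if dates else None,
    }
```

A possible way to use it from the export folder: print(summarize_export(load_export("result.json"))). If the counts look far lower than expected, the export was probably made without media or with a date limit.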
2. Upload media to Archive.org
The website itself should stay light.
So media files can be uploaded to Archive.org instead of being stored inside the Git repository.
Install the Internet Archive CLI (I'm on Fedora Linux; search for the equivalent steps on your OS):
sudo dnf install pipx
pipx ensurepath
pipx install internetarchive
Configure it:
ia configure
Upload the media folders:
ia upload YOUR-ARCHIVE-ITEM-NAME \
  photos/ \
  video_files/ \
  files/ \
  stickers/ \
  voice_messages/ \
  images/
Replace:
YOUR-ARCHIVE-ITEM-NAME
with the Archive.org item name you created.
If the upload stops or Archive.org rate-limits you, upload missing files slowly.
First create a local list:
find photos video_files files stickers voice_messages images \
  -type f -printf '%f\n' | sort > local-flat.txt
Then create an Archive.org list:
ia list YOUR-ARCHIVE-ITEM-NAME | sort > uploaded.txt
Compare them:
comm -23 local-flat.txt uploaded.txt > missing-flat.txt
Upload the missing files slowly:
while IFS= read -r name; do
  file=$(find photos video_files files stickers voice_messages images -type f -name "$name" | head -n 1)
  if [ -n "$file" ]; then
    echo "Uploading: $file"
    ia upload YOUR-ARCHIVE-ITEM-NAME "$file"
    sleep 12
  else
    echo "Could not find: $name"
  fi
done < missing-flat.txt
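The comparison step can also be sketched in Python, which may be easier to debug than comm when the two lists are not formatted the same way. This is only an illustration under assumptions: the helper names are hypothetical, and it assumes uploaded.txt holds one file name per line, possibly with a folder prefix, which the sketch strips before comparing.

```python
from pathlib import Path


def local_file_names(folders, root="."):
    """Collect the bare file names found under the given media folders."""
    names = set()
    for folder in folders:
        base = Path(root) / folder
        if base.is_dir():  # skip folders the export did not create
            names.update(p.name for p in base.rglob("*") if p.is_file())
    return names


def missing_names(local_names, uploaded_lines):
    """Return local names that do not appear in the uploaded list."""
    # Keep only the last path component, so "photos/a.jpg" matches "a.jpg".
    uploaded = {
        line.strip().rsplit("/", 1)[-1]
        for line in uploaded_lines
        if line.strip()
    }
    return sorted(set(local_names) - uploaded)
```

For example, missing_names(local_file_names(["photos", "files"]), open("uploaded.txt", encoding="utf-8")) would give the names that still need uploading. Unlike the comm version, this comparison ignores folder prefixes on the Archive.org side.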
3. Process the Telegram JSON
The raw Telegram JSON is not ideal for the frontend.
So it is processed into a cleaner file:
result.processed.json
You will need this file: https://github.com/cppuix/personalEnglishTgChannel/blob/master/tools/preprocess.py
Make sure you follow the project folder structure I have in the linked repo.
Run:
python3 tools/preprocess.py result.json result.processed.json \
  --media-base https://archive.org/download/YOUR-ARCHIVE-ITEM-NAME/ \
  --flatten-media
This does several things:
- Cleans the message data.
- Converts text into paragraphs.
- Handles Arabic and English paragraph direction.
- Builds Archive.org media URLs.
- Adds searchable text.
- Adds tags for filtering.
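To give a feel for the paragraph and direction steps, here is a minimal sketch of one common approach: split the text on line breaks and pick the direction from the first strongly directional character. This is an assumption about how such a step could work, not a copy of what preprocess.py does, and the function names and the "dir" key are hypothetical.

```python
import unicodedata


def paragraph_direction(text: str) -> str:
    """Return "rtl" or "ltr" based on the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # Hebrew-type or Arabic letter
            return "rtl"
        if bidi == "L":  # Latin and other left-to-right letters
            return "ltr"
    # Digits, punctuation, or empty text: fall back to left-to-right.
    return "ltr"


def split_paragraphs(text: str) -> list[dict]:
    """Split on line breaks, drop empty lines, and tag each paragraph."""
    paragraphs = []
    for block in text.split("\n"):
        block = block.strip()
        if block:
            paragraphs.append({"text": block, "dir": paragraph_direction(block)})
    return paragraphs
```

With this approach a post that mixes both languages gets a direction per paragraph, so an Arabic paragraph is right-aligned even when the next one is English.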
The --flatten-media option is important because Archive.org may store uploaded files without their original folder paths.
Example:
photos/photo_1.jpg
becomes:
https://archive.org/download/YOUR-ARCHIVE-ITEM-NAME/photo_1.jpg
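The mapping itself is small. As a rough sketch (the function name is hypothetical and the real script may differ), flattening keeps only the file name, URL-encodes it, and appends it to the media base:

```python
from posixpath import basename
from urllib.parse import quote


def flatten_media_url(media_base: str, relative_path: str) -> str:
    """Build a media URL from the base and the file name only."""
    # Normalize Windows-style separators before taking the last component.
    name = basename(relative_path.replace("\\", "/"))
    # quote() escapes spaces and other characters that are unsafe in URLs.
    return media_base.rstrip("/") + "/" + quote(name)
```

One consequence of flattening is that two files with the same name in different folders would end up with the same URL, so it is worth checking for duplicate names before relying on it.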
3.5 Build the Static Website
The archive website itself is just a static site made with:
- HTML (structure)
- CSS (design)
- JavaScript (logic)
Its only job is to:
- Load result.processed.json
- Display messages
- Allow searching and filtering
- Show media using URLs
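The search and filter logic is simple enough to sketch in a few lines. The real site does this in JavaScript; the version below is in Python only to keep the idea compact, and it assumes each processed message has a "date" string in ISO format and a "search_text" field. Both field names are assumptions, so check them against your own result.processed.json.

```python
def filter_messages(messages, query="", month=None):
    """Return messages matching a text query and an optional "YYYY-MM" month."""
    needle = query.casefold().strip()
    results = []
    for msg in messages:
        # ISO dates start with "YYYY-MM", so a prefix check selects a month.
        if month and not msg.get("date", "").startswith(month):
            continue
        # casefold() gives a case-insensitive match for Latin text.
        if needle and needle not in msg.get("search_text", "").casefold():
            continue
        results.append(msg)
    # Chronological order, relying on ISO date strings sorting correctly.
    return sorted(results, key=lambda m: m.get("date", ""))
```

For instance, filter_messages(messages, query="grammar", month="2023-02") would return only the February 2023 posts that mention "grammar". The same two checks translate directly to an Array.filter call in app.js.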
You do not need advanced web development skills to build this.
A practical approach is to use an LLM to generate the initial site structure, then refine it step by step.
Using an LLM to Generate the Website
You can ask an LLM to create a static viewer that reads the Telegram export JSON and displays it.
Start with a clear prompt describing what the site must do.
Example prompt:
Create a static website using only HTML, CSS, and vanilla JavaScript.

The website should:
- Load a file named result.processed.json
- Display messages in chronological order
- Show message text with proper paragraph formatting
- Display media (images, video, audio, files) using URLs inside the JSON
- Support searching messages by text
- Support filtering by date
- Handle both Arabic (RTL) and English (LTR) text correctly
- Use a clean, readable layout
- Work fully offline except for media URLs
- Not require any frameworks or backend

Return:
- index.html
- styles.css
- app.js

Keep the code simple and readable.
After generating the first version:
- Test it locally
- Adjust features gradually
- Ask for improvements as needed
Example refinement prompts:
Add dark and light mode support.
Add month-based filtering.
Improve performance when loading large JSON files.
Add copy-to-clipboard button for message text.
Summary
The static site is simply a viewer that reads JSON and displays messages.
You can generate the first version using an LLM, then refine it step by step.
Most of the complexity comes from customization, not from the basic structure.
4. Run the archive locally
Start a local server:
python3 -m http.server 8080
Open:
http://localhost:8080
Do not open index.html directly by double-clicking it, because the browser may block loading the JSON file.
5. Deploy the archive
The archive is static, so it can be hosted on:
- GitHub Pages
- Vercel
- Netlify
- Any static host
The main files are:
index.html
styles.css
app.js
result.processed.json
assets/
fonts/
Commit and push:
git add .
git commit -m "Build Telegram archive"
git push
6. Updating the archive later
When you want to update the archive:
- Export the channel again.
- Replace result.json.
- Upload any new media files to Archive.org.
- Regenerate result.processed.json.
- Test locally.
- Commit and push.
Regenerate:
python3 tools/preprocess.py result.json result.processed.json \
  --media-base https://archive.org/download/YOUR-ARCHIVE-ITEM-NAME/ \
  --flatten-media
Test:
python3 -m http.server 8080
Commit:
git add result.json result.processed.json
git commit -m "Update archive data"
git push
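If you keep a copy of the previous export before replacing result.json, you can check what an update actually adds. The sketch below is an optional helper, not part of the linked repository. It assumes both exports have a top-level "messages" list whose entries carry a numeric "id", and the function name is made up for illustration.

```python
def new_message_ids(old_export: dict, new_export: dict) -> list:
    """Return the IDs present in the new export but not in the old one."""
    old_ids = {m["id"] for m in old_export.get("messages", []) if "id" in m}
    return sorted(
        m["id"]
        for m in new_export.get("messages", [])
        if "id" in m and m["id"] not in old_ids
    )
```

Both arguments are the parsed JSON of an export, for example json.load() applied to the old copy and to the new result.json. An empty result means the new export contains no new posts, which usually points to an export that was cut short.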
Folder structure
A typical project folder looks like this:
index.html
styles.css
app.js
result.json
result.processed.json
assets/
  logo.jpg
fonts/
  ExampleFont-Regular.ttf
tools/
  preprocess.py
Summary
The archive works by turning a Telegram JSON export into a static website, while storing large media files on Archive.org.
Telegram provides the exported content.
Archive.org stores the media.
The static site presents everything in a searchable archive.