PDF Remove Metadata

This repository contains a Python tool that removes metadata from a specified PDF document and linearizes the file for faster loading on the web. The cleaned version overwrites the original file.

Only for Testing Purposes

This tool is intended for educational and authorized penetration testing purposes only. Unauthorized use of this tool against systems that you do not have explicit permission to test is illegal and unethical.

Table of Contents

Features
Getting Started
- Prerequisites
- Installation
Usage
- Adjusting the Script
- Running the Tool
Output

Features

This tool offers the following features:

- Clean Metadata: Removes all metadata from the specified PDF file.
- Linearize: Saves the cleaned file in linearized form for improved web performance.
- Display Metadata: Prints the original and cleaned metadata to the console for comparison.
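
The features above could be sketched with pikepdf roughly as follows. This is a minimal illustration and not the code in remove-metadata.py: the function name `strip_metadata` is hypothetical, and it assumes that "metadata" means the document information dictionary (`/Info`) and the XMP stream (`/Metadata`).

```python
from pathlib import Path


def strip_metadata(pdf_path):
    """Remove /Info and XMP metadata from a PDF in place and linearize it.

    Hypothetical helper, not the function shipped in remove-metadata.py.
    Returns (old, new): the document information entries before and after.
    """
    import pikepdf  # third-party dependency listed under Prerequisites

    path = Path(pdf_path)

    # allow_overwriting_input lets save() write back to the file that was opened.
    with pikepdf.open(path, allow_overwriting_input=True) as pdf:
        old = {}
        if "/Info" in pdf.trailer:
            old = {key: str(value) for key, value in pdf.trailer["/Info"].items()}
            del pdf.trailer["/Info"]      # drop the document information dictionary
        if "/Metadata" in pdf.Root:
            del pdf.Root["/Metadata"]     # drop the XMP metadata stream
        pdf.save(path, linearize=True)    # overwrite the original, linearized

    # Re-open the saved file to report what is left.
    with pikepdf.open(path) as cleaned:
        new = {}
        if "/Info" in cleaned.trailer:
            new = {key: str(value) for key, value in cleaned.trailer["/Info"].items()}
    return old, new
```

Because the file is overwritten, it may be worth running a sketch like this on a copy first.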

Getting Started

Prerequisites

Before running the script, make sure you have the following installed:

Python 3.7 or higher
Python libraries:
- pikepdf
- exiftool

You can install these dependencies using pip:

pip install pikepdf exiftool
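
To confirm the prerequisites before running the tool, a small check along these lines may help. It is only a sketch: `check_prerequisites` is a hypothetical helper, and it assumes the import names match the package names listed above.

```python
import importlib.util
import sys


def check_prerequisites(modules=("pikepdf", "exiftool"), min_python=(3, 7)):
    """Return a list of human-readable problems; an empty list means all good.

    Assumption: each library is importable under the name given in `modules`.
    """
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or higher is required" % min_python)
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing Python library: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```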

Installation

Clone the Repository:

git clone https://github.com/yourusername/pdf-metadata-cleaner.git
cd pdf-metadata-cleaner

Usage

Adjusting the Script

Before running the tool, open remove-metadata.py and set the PDF path in the clean_pdf function to the file you want to clean. The PDF file must be in the same folder as the script.
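
One way to make the "same folder as the script" requirement explicit is to build the path from the script's own location. The helper below is a hypothetical sketch (`resolve_pdf` and the file name `example.pdf` are not part of the repository); the returned path could then be used wherever the path in clean_pdf is set.

```python
from pathlib import Path


def resolve_pdf(filename, script_file=None):
    """Return the path of `filename` inside the folder that holds the script.

    Hypothetical helper; `script_file` defaults to this file, mirroring the
    requirement that the PDF sits next to the script.
    """
    script_dir = Path(script_file or __file__).resolve().parent
    pdf_path = script_dir / filename
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF file name: {filename}")
    if not pdf_path.is_file():
        raise FileNotFoundError(f"{pdf_path} was not found next to the script")
    return pdf_path


# Example call (assumes example.pdf exists beside the script):
# pdf_path = resolve_pdf("example.pdf")
```

Failing early with a clear error avoids overwriting or creating a file in an unexpected working directory.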

Running the Tool

Run the script with the following command in the command line:

python remove-metadata.py

Output

The script prints the original and cleaned metadata to the console so you can compare them. The cleaned version replaces the original PDF file.
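
The exact console format depends on the script. As a rough illustration of a before/after comparison, the sketch below formats two metadata dictionaries field by field. The function name and the sample values are made up for the example and are not real output of the tool.

```python
def format_comparison(old, new):
    """Build a line-per-field comparison of two metadata dictionaries."""
    lines = []
    for key in sorted(set(old) | set(new)):
        before = old.get(key, "<absent>")
        after = new.get(key, "<removed>")
        lines.append(f"{key}: {before} -> {after}")
    return "\n".join(lines) if lines else "no metadata fields found"


if __name__ == "__main__":
    # Sample values for illustration only.
    old = {"/Author": "Jane Doe", "/Title": "Draft"}
    print(format_comparison(old, {}))
```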
