Installation is easy thanks to conda environments. Simply run conda env create -f environment.yml from the root project directory and conda will create an environment called lecture2notes with all the required packages from environment.yml.
.. note:: Read the paper for more in-depth explanations regarding the background, methodology, and results of this project.
Info About Optional Components
Certain functions in the End-To-End transcribe.py file require additional downloads. If you are not using the transcribe feature of the End-To-End approach, this notice can safely be ignored. These extra files may not be necessary depending on your configuration. To use the similarity function to compare two transcripts, a spacy model is needed; you can learn more about the options in the spacy starter models and core models documentation.
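The similarity feature itself relies on a spaCy model's semantic vectors. As a rough, model-free illustration of scoring the overlap between two transcripts (this is only a stdlib stand-in, not the project's spaCy-based method), Python's difflib can be used:

```python
from difflib import SequenceMatcher

def rough_similarity(transcript_a: str, transcript_b: str) -> float:
    """Return a 0-1 string-overlap score between two transcripts.

    Stdlib stand-in only: the real feature uses a spaCy model's
    semantic similarity, which requires the downloads described above.
    """
    return SequenceMatcher(None, transcript_a.lower(), transcript_b.lower()).ratio()
```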
The default transcription method in the End-To-End process is vosk. You need to download a vosk model from the models page (Google Drive Mirror) to use this method, or you can specify a different method with the --transcription_method flag.
figure_detection.py contains a function called detect_figures(). This function requires the EAST (Efficient and Accurate Scene Text Detector) model by default because the do_text_check argument defaults to True. See the docstring for more information. You can download the model from Dropbox (this link was extracted from the official code) or Google Drive (my mirror). Then extract the file by running tar -xzvf frozen_east_text_detection.tar.gz.
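If you prefer to script the extraction, the standard library's tarfile module can do the same thing as the tar command above (a minimal sketch; the .pb filename inside the archive is assumed to match the archive name):

```python
import tarfile
from pathlib import Path

def extract_east_model(archive: str, dest: str = ".") -> Path:
    """Extract the EAST archive and return the expected model path."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    # The archive is assumed to contain frozen_east_text_detection.pb.
    return Path(dest) / "frozen_east_text_detection.pb"
```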
Quick-Install (Copy & Paste)
git clone https://github.com/HHousen/lecture2notes.git
cd lecture2notes
conda env create
conda activate lecture2notes
python -m spacy download en_core_web_sm
gdown "https://drive.google.com/uc?id=1eXwWQujo_0HVffuUx0Fa6KydjW8h4gUb" -O lecture2notes/end_to_end/model_best.ckpt
Extras (Linux Only):
Install extras only after the above commands have been run.
sudo apt install curl
sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
sudo apt install ffmpeg sox wget poppler-utils
Commands to download a Vosk model (needed for speech-to-text) are available on the 4. Vosk transcription method page.
1. Clone this repository: git clone https://github.com/HHousen/lecture2notes.git
2. Change to the project directory: cd lecture2notes
3. Run the installation command: conda env create
4. Activate the newly created conda environment: conda activate lecture2notes
5. Run gdown "https://drive.google.com/uc?id=1eXwWQujo_0HVffuUx0Fa6KydjW8h4gUb" -O lecture2notes/end_to_end/model_best.ckpt from the project root to download the slide classification model and put it in the default expected location.
Other Binary Packages: Run sudo apt install ffmpeg sox wget poppler-utils if on Linux. Otherwise, navigate to the sox homepage to download sox, the youtube-dl homepage (GitHub) to download youtube-dl, and follow the directions in this StackOverflow answer (Windows) to install poppler-utils for your platform.
ffmpeg is needed for frame extraction. sox is needed for automatic audio conversion during the transcription phase. wget is used to download videos that are not on YouTube as part of the video_downloader scraper script.
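To confirm the binaries above are reachable before running the pipeline, a quick check with the standard library's shutil.which can help (a sketch; the tool list mirrors the packages above, with pdftoppm standing in for poppler-utils):

```python
import shutil

def missing_binaries(required=("ffmpeg", "sox", "wget", "youtube-dl", "pdftoppm")):
    """Return the names of required command-line tools not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]

# Example: print what still needs to be installed.
print(missing_binaries())
```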
- End-To-End Process Requirements (Optional)
Spacy: Download the small spacy model by running python -m spacy download en_core_web_sm in the project root. This is required to use certain summarization and similarity features (as discussed above). A spacy model is also required when using spacy as a feature extractor.
DeepSpeech/Vosk: Download the DeepSpeech model files (the .pbmm acoustic model and the scorer) from the DeepSpeech releases page. To reduce complexity, save them to deepspeech-models in the project root. [3] Alternatively (recommended), download the small vosk model using the commands on the 4. Vosk transcription method page.
EAST: Download the EAST model from Dropbox or by running gdown https://drive.google.com/uc?id=1ZVn7_g58g4B0QNYNFE6MzRzpirsNTjwe. Extract it to the end_to_end directory by running tar -xzvf frozen_east_text_detection.tar.gz -C end_to_end/
- Dataset Collection Requirements (Optional)
YouTube API: Run cp .env.example .env to create a copy of the example environment file. Add your YouTube API key to your .env file. You can now use the scraper scripts to scrape YouTube and create the dataset needed to train the slide classifier.
Transcript Download w/ YouTube API (Not Recommended): If you want to download video transcripts with the YouTube API [4], place your .env file in the dataset/scraper-scripts folder (if you want to download transcripts with the scraper-scripts) or in end_to_end (if you want to download transcripts in the entire end-to-end process that converts a lecture video to notes).
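A .env file of this shape is just KEY=VALUE lines. A minimal reader could look like the sketch below (illustrative only, not the project's actual loader; the YOUTUBE_API_KEY name is hypothetical):

```python
from pathlib import Path

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Hypothetical usage: api_key = load_env()["YOUTUBE_API_KEY"]
```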
If your audio is already 16000 Hz and 1 channel, sox is not needed.
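The check this footnote describes can be sketched with the standard library's wave module (assuming the audio is already a WAV file; anything failing the check would be handed to sox for conversion):

```python
import wave

def needs_sox_conversion(path: str) -> bool:
    """Return True unless the WAV file is already 16 kHz mono."""
    with wave.open(path, "rb") as wav:
        return not (wav.getframerate() == 16000 and wav.getnchannels() == 1)
```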
The default is not to use spacy for feature extraction, but if spacy is manually chosen, the large model (which can be downloaded with python -m spacy download en_core_web_lg) is the default. So make sure to download the large model if you want to use spacy for feature extraction.
[3] Folder name and location do not matter. Just make sure the scorer and model are in the same directory; the scripts will automatically detect each when given the path to the folder containing them.
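The automatic detection this footnote describes could look something like the following sketch (file extensions assumed from the DeepSpeech release naming; the function name is illustrative):

```python
from pathlib import Path

def find_deepspeech_files(model_dir: str):
    """Return the first .pbmm acoustic model and .scorer found in a folder."""
    folder = Path(model_dir)
    model = next(folder.glob("*.pbmm"), None)
    scorer = next(folder.glob("*.scorer"), None)
    return model, scorer
```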
[4] The default is to use youtube-dl, which needs no API key.