Installation
Overview
Installation is handled through conda environments. Run conda env create -f environment.yml from the project root directory, and conda will create an environment called lecture2notes containing all the required packages listed in environment.yml.
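To confirm that the environment was actually created, the output of conda env list can be checked for the lecture2notes name. The following is a minimal sketch, not part of the project: the helper name has_env is hypothetical, and it assumes that conda env list prints one environment per line with the name in the first column.

```shell
# Hypothetical helper: succeeds if an environment named "$1" appears
# in `conda env list` output read from stdin (name in the first column).
has_env() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only query conda when it is actually installed.
if command -v conda >/dev/null 2>&1; then
    if conda env list | has_env lecture2notes; then
        echo "lecture2notes environment found"
    else
        echo "lecture2notes environment missing; run: conda env create -f environment.yml"
    fi
fi
```

Matching on the whole first column avoids false positives from environments whose names merely contain lecture2notes.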
Note: Read the paper for more in-depth explanations regarding the background, methodology, and results of this project.
Info About Optional Components
Certain functions in the End-To-End transcribe.py file require additional downloads. If you are not using the transcribe feature of the End-To-End approach, this notice can safely be ignored, and depending on your configuration some of these extra files may not be needed at all. Comparing two transcripts with the similarity function requires a spacy model, which you can learn more about in the spacy starter models and core models documentation.
The default transcription method in the End-To-End process is vosk. To use it, download a vosk model from the models page (Hugging Face Mirror), or specify a different method with the --transcription_method flag, such as --transcription_method wav2vec.
The End-To-End figure_detection.py contains a function called detect_figures(). This function requires the EAST (Efficient and Accurate Scene Text Detector) model by default due to the do_text_check argument defaulting to True. See the docstring for more information. You can download the model from Dropbox (this link was extracted from the official code) or Hugging Face. Then just extract the file by running tar -xzvf frozen_east_text_detection.tar.gz.
Quick-Install (Copy & Paste)
git clone https://github.com/HHousen/lecture2notes.git
cd lecture2notes
conda env create
conda activate lecture2notes
python -m spacy download en_core_web_sm
wget "https://huggingface.co/HHousen/lecture2notes/resolve/main/Slide%20Classifier%20Median%20Weights/three-category/epoch%3D8.ckpt" -O lecture2notes/end_to_end/model_best.ckpt
Extras (Linux Only):
Install extras only after the above commands have been run.
sudo apt install curl
sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
sudo apt install ffmpeg sox wget poppler-utils
Commands to download a Vosk model (needed for speech-to-text) are available on the 4. Vosk transcription method page.
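After installing the extras, it can help to confirm that each binary is reachable on PATH. The sketch below is illustrative only: missing_tools is a hypothetical helper, and pdftotext is used as a stand-in for poppler-utils because it is one of the binaries that package ships (the project may rely on a different one).

```shell
# Hypothetical helper: print each tool from the argument list that is not
# on PATH, one per line; return non-zero if at least one is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# pdftotext is an assumed representative of poppler-utils.
missing=$(missing_tools youtube-dl ffmpeg sox wget pdftotext) || true
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
fi
```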
Step-by-Step Instructions
Clone this repository:
git clone https://github.com/HHousen/lecture2notes.git
Change to project directory:
cd lecture2notes
Run installation command:
conda env create
Activate newly created conda environment:
conda activate lecture2notes
Download the slide classification model and put it in the default expected location by running the following from the project root:
wget "https://huggingface.co/HHousen/lecture2notes/resolve/main/Slide%20Classifier%20Median%20Weights/three-category/epoch%3D8.ckpt" -O lecture2notes/end_to_end/model_best.ckpt
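Because wget -O creates the output file before the transfer starts, an interrupted or failed download can leave an empty model_best.ckpt behind. A minimal sketch of a sanity check, run from the project root, might look like this; is_downloaded is a hypothetical helper, and it only checks that the file is non-empty, not that its contents are a valid checkpoint.

```shell
# Hypothetical helper: succeeds when "$1" is a regular file with non-zero size.
is_downloaded() {
    [ -f "$1" ] && [ -s "$1" ]
}

# Default location used by the wget command above.
ckpt="lecture2notes/end_to_end/model_best.ckpt"
if is_downloaded "$ckpt"; then
    echo "Slide classifier checkpoint present: $ckpt"
else
    echo "Checkpoint missing or empty: $ckpt"
fi
```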
Other Binary Packages: Install ffmpeg, sox, wget, and poppler-utils with sudo apt install ffmpeg sox wget poppler-utils if on Linux. Otherwise, navigate to the sox homepage to download sox, the youtube-dl homepage (GitHub) to download youtube-dl, and follow the directions in this StackOverflow answer (Windows) to install poppler-utils for your platform.
ffmpeg is needed for frame extraction in Dataset and End-To-End.
sox is needed for automatic audio conversion during the transcription phase of End-To-End. [1]
wget is used to download videos that are not on YouTube as part of the video_downloader scraper script in Dataset.
- End-To-End Process Requirements (Optional)
Spacy: Download the small spacy model by running python -m spacy download en_core_web_sm in the project root. This is required to use certain summarization and similarity features (as discussed above). A spacy model is also required when using spacy as a feature extractor in end_to_end/summarization_approaches.py. [2]
DeepSpeech/Vosk: Download the DeepSpeech model (the .pbmm acoustic model and the scorer) from the releases page. To reduce complexity, save them to deepspeech-models in the project root. [3] Alternatively, it is recommended to download the small vosk model using the commands on the 4. Vosk transcription method page.
EAST: Download the EAST model from Dropbox or by running wget https://huggingface.co/HHousen/lecture2notes/resolve/main/frozen_east_text_detection.pb -O end_to_end/frozen_east_text_detection.pb. If downloading from Dropbox, extract it to the End-To-End directory by running tar -xzvf frozen_east_text_detection.tar.gz -C end_to_end/
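When using the Dropbox archive, listing its contents before extracting can catch a truncated or wrong download. The following sketch is an illustration under assumptions: archive_contains is a hypothetical helper, and the archive is assumed to hold a file named frozen_east_text_detection.pb.

```shell
# Hypothetical helper: succeeds if the gzip-compressed tar archive "$1"
# lists an entry whose path contains the literal string "$2".
archive_contains() {
    tar -tzf "$1" | grep -F -- "$2" >/dev/null
}

archive="frozen_east_text_detection.tar.gz"
if [ -f "$archive" ]; then
    # The .pb file name inside the archive is an assumption.
    if archive_contains "$archive" "frozen_east_text_detection.pb"; then
        tar -xzvf "$archive" -C end_to_end/
    else
        echo "$archive does not list frozen_east_text_detection.pb"
    fi
fi
```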
- Dataset Collection Requirements (Optional)
YouTube API
Run cp .env.example .env to create a copy of the example .env file.
Add your YouTube API key to your .env file.
You can now use the scraper scripts to scrape YouTube and create the dataset needed to train the slide classifier.
Transcript Download w/YouTube API (Not Recommended)
If you want to download video transcripts with the YouTube API [4], place your client_secret.json in the dataset/scraper-scripts folder (if you want to download transcripts with the scraper-scripts) or in End-To-End (if you want to download transcripts in the entire end-to-end process that converts a lecture video to notes).
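For the .env step in the YouTube API setup, a plain cp overwrites any existing .env, including a key already stored in it. One possible guard, shown here as a sketch, copies the example only when no .env exists yet; init_env is a hypothetical helper and not part of the project.

```shell
# Hypothetical helper: copy the example file "$1" to the target "$2" only
# when the target does not exist, so an existing .env is never overwritten.
init_env() {
    example="$1"
    target="$2"
    if [ -e "$target" ]; then
        echo "kept existing $target"
    elif [ -f "$example" ]; then
        cp "$example" "$target" && echo "created $target"
    else
        echo "no $example found" >&2
        return 1
    fi
}

# Run from the project root; the API key still has to be added by hand.
init_env .env.example .env || true
```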
Footnotes