Installation
Overview
Installation is straightforward thanks to conda environments. Run conda env create -f environment.yml from the root project directory and conda will create an environment called lecture2notes with all the required packages from environment.yml.
Note: Read the paper for more in-depth explanations of the background, methodology, and results of this project.
Info About Optional Components
Certain functions in the End-To-End transcribe.py file require additional downloads. If you are not using the transcribe feature of the End-To-End approach, you can safely ignore this notice, and depending on your configuration some of these extra files may not be needed. To use the similarity function to compare two transcripts, a spacy model is needed, which you can learn more about in the spacy starter models and core models documentation.
The default transcription method in the End-To-End process is vosk. To use this method, download a vosk model from the models page (Google Drive Mirror). Alternatively, specify a different method with the --transcription_method flag, such as --transcription_method wav2vec.
The End-To-End figure_detection.py file contains a function called detect_figures(). By default, this function requires the EAST (Efficient and Accurate Scene Text Detector) model because the do_text_check argument defaults to True. See the docstring for more information. You can download the model from Dropbox (this link was extracted from the official code) or Google Drive (my mirror). Then extract the file by running tar -xzvf frozen_east_text_detection.tar.gz.
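On platforms where the tar command is not available, the extraction step can be approximated with Python's standard library. The following is a minimal sketch, not part of lecture2notes; the function name extract_archive is hypothetical, and the archive name is simply the one mentioned above.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a gzip-compressed tarball, roughly like `tar -xzvf archive -C dest_dir`.

    Hypothetical helper for illustration; returns the paths of the archive members.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Only extract archives from sources you trust.
        archive.extractall(dest)
    return [dest / name for name in names]


if __name__ == "__main__":
    archive = Path("frozen_east_text_detection.tar.gz")
    if archive.exists():
        for path in extract_archive(archive):
            print(path)
    else:
        print(f"{archive} not found; download it first.")
```

Passing a second argument such as "end_to_end" mirrors the -C option of tar.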
Quick-Install (Copy & Paste)
git clone https://github.com/HHousen/lecture2notes.git
cd lecture2notes
conda env create
conda activate lecture2notes
python -m spacy download en_core_web_sm
gdown "https://drive.google.com/uc?id=1eXwWQujo_0HVffuUx0Fa6KydjW8h4gUb" -O lecture2notes/end_to_end/model_best.ckpt
Extras (Linux Only):
Install extras only after the above commands have been run.
sudo apt install curl
sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
sudo apt install ffmpeg sox wget poppler-utils
Commands to download a Vosk model (needed for speech-to-text) are available on the 4. Vosk transcription method page.
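After installing the extras, it can be useful to confirm that the external tools are actually on the PATH. The sketch below is an illustration only and not part of lecture2notes: the function name find_missing_tools is hypothetical, and pdftoppm is assumed here as a representative executable from poppler-utils.

```python
import shutil

# Executable names to look for. `pdftoppm` is an assumption: poppler-utils
# ships several tools and this one stands in for the whole package.
REQUIRED_TOOLS = ["ffmpeg", "sox", "wget", "youtube-dl", "pdftoppm"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All external tools were found on the PATH.")
```

Not every tool is needed for every workflow, so a missing entry only matters if you use the feature that depends on it.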
Step-by-Step Instructions
- Clone this repository: git clone https://github.com/HHousen/lecture2notes.git
- Change to the project directory: cd lecture2notes
- Run the installation command: conda env create
- Activate the newly created conda environment: conda activate lecture2notes
- Download the slide classification model and put it in the default expected location by running the following from the project root: gdown "https://drive.google.com/uc?id=1eXwWQujo_0HVffuUx0Fa6KydjW8h4gUb" -O lecture2notes/end_to_end/model_best.ckpt
- Other Binary Packages: Install ffmpeg, sox, wget, and poppler-utils with sudo apt install ffmpeg sox wget poppler-utils if on Linux. Otherwise, navigate to the sox homepage to download sox, the youtube-dl homepage (GitHub) to download youtube-dl, and follow the directions in this StackOverflow answer (Windows) to install poppler-utils for your platform.
  - ffmpeg is needed for frame extraction in Dataset and End-To-End.
  - sox is needed for automatic audio conversion during the transcription phase of End-To-End. [1]
  - wget is used to download videos that are not on YouTube as part of the video_downloader scraper script in Dataset.
- End-To-End Process Requirements (Optional)
  - Spacy: Download the small spacy model by running python -m spacy download en_core_web_sm in the project root. This is required to use certain summarization and similarity features (as discussed above). A spacy model is also required when using spacy as a feature extractor in end_to_end/summarization_approaches.py. [2]
  - DeepSpeech/Vosk: Download the DeepSpeech model (the .pbmm acoustic model and the scorer) from the releases page. To reduce complexity, save them to deepspeech-models in the project root. [3] Alternatively, and this is the recommended option, download the small vosk model using the commands on the 4. Vosk transcription method page.
  - EAST: Download the EAST model from Dropbox or by running gdown https://drive.google.com/uc?id=1ZVn7_g58g4B0QNYNFE6MzRzpirsNTjwe and then extract it to the End-To-End directory by running tar -xzvf frozen_east_text_detection.tar.gz -C end_to_end/
- Dataset Collection Requirements (Optional)
  - YouTube API
    - Run cp .env.example .env to create a copy of the example .env file.
    - Add your YouTube API key to your .env file.
    - You can now use the scraper scripts to scrape YouTube and create the dataset needed to train the slide classifier.
  - Transcript Download w/YouTube API (Not Recommended): If you want to download video transcripts with the YouTube API [4], place your client_secret.json in the dataset/scraper-scripts folder (if you want to download transcripts with the scraper-scripts) or in End-To-End (if you want to download transcripts in the entire end-to-end process that converts a lecture video to notes).
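For the DeepSpeech option described in the End-To-End requirements, the acoustic model and the scorer must sit in the same folder. The sketch below shows one way to verify that before starting a long transcription run. It is not the project's own detection logic: the function name find_deepspeech_files is hypothetical, and the .scorer file extension is an assumption based on how DeepSpeech release files are commonly named.

```python
from pathlib import Path


def find_deepspeech_files(model_dir):
    """Return (model_path, scorer_path) found directly inside `model_dir`.

    Hypothetical helper for illustration. Either value is None when no
    matching file exists. The `.scorer` extension is an assumption.
    """
    folder = Path(model_dir)
    models = sorted(folder.glob("*.pbmm"))
    scorers = sorted(folder.glob("*.scorer"))
    model = models[0] if models else None
    scorer = scorers[0] if scorers else None
    return model, scorer


if __name__ == "__main__":
    model, scorer = find_deepspeech_files("deepspeech-models")
    print("Acoustic model:", model or "not found")
    print("Scorer:", scorer or "not found")
```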
Footnotes
[1] If your audio is 16000Hz, 1 channel, and .wav format, then sox is not needed.
[2] The default is not to use spacy for feature extraction, but the large model (which can be downloaded with python -m spacy download en_core_web_lg) is the default if spacy is manually chosen. Make sure to download the large model if you want to use spacy for feature extraction.
[3] Folder name and location do not matter. Just make sure the scorer and model are in the same directory. The scripts will automatically detect each when given the path to the folder containing them.
[4] The default is to use youtube-dl, which needs no API key.
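The first footnote states that sox is unnecessary when the audio is already a 16000Hz, single-channel .wav file. A rough way to test a file against those criteria with Python's standard wave module is sketched below. The function name needs_conversion is hypothetical, and this is not how lecture2notes itself decides whether to convert audio.

```python
import wave


def needs_conversion(wav_path, sample_rate=16000, channels=1):
    """Return True if the file is not a WAV with the target rate and channel count.

    Hypothetical helper for illustration only.
    """
    try:
        with wave.open(str(wav_path), "rb") as audio:
            return (
                audio.getframerate() != sample_rate
                or audio.getnchannels() != channels
            )
    except (wave.Error, EOFError):
        # Not a readable PCM WAV file, so it would have to be converted.
        return True
```

A missing file still raises FileNotFoundError, which keeps a wrong path from being mistaken for a format problem.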