Software-Engineering
# PDF
In most cases, processed data is saved as a PDF file. To extract data from a PDF file, we use the PyPDF2 library.
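
As a rough illustration of the extraction step, the sketch below reads every page of a PDF and joins the page texts. It assumes PyPDF2 2.x or later, where `PdfReader`, `reader.pages` and `page.extract_text()` are available; the function names `join_page_texts` and `extract_pdf_text` are hypothetical and chosen only for this example.

```python
def join_page_texts(pages):
    """Concatenate the text of each page, separated by blank lines.

    `pages` is any iterable of objects with an extract_text() method,
    such as the pages of a PyPDF2 PdfReader.
    """
    # extract_text() may yield nothing for scanned or image-only pages,
    # so fall back to an empty string for those.
    return "\n\n".join((page.extract_text() or "") for page in pages)


def extract_pdf_text(path):
    """Open the PDF at `path` and return the text of all its pages."""
    # Imported here so that join_page_texts() stays usable without PyPDF2.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    return join_page_texts(reader.pages)
```

Calling `extract_pdf_text("report.pdf")` (the file name is a placeholder) would return one string for the whole document. Text extraction quality depends on how the PDF was produced, so the result usually needs some cleanup.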
# HTML
When extracting data from the WWW, we need two libraries: urllib to download the page and BeautifulSoup to parse it.
# Basics
# urllib
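
A minimal sketch of downloading a page with the standard library's `urllib.request` might look like this. The helper name `fetch_html`, the 10-second timeout and the UTF-8 fallback are assumptions made for this example.

```python
import urllib.request


def fetch_html(url, timeout=10.0):
    """Download `url` and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in the Content-Type header, if any;
        # UTF-8 is an assumed fallback when none is declared.
        charset = response.headers.get_content_charset() or "utf-8"
    # Replace undecodable bytes instead of raising an error.
    return raw.decode(charset, errors="replace")
```

A call such as `fetch_html("https://example.com")` returns the page source as a string. Network errors surface as `urllib.error.URLError`, which the caller may want to catch.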
# BeautifulSoup
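
Once the HTML is available as a string, BeautifulSoup can turn it into a searchable tree. The sketch below assumes the `bs4` package is installed and uses Python's built-in `html.parser` backend. `SAMPLE_HTML`, `page_title` and `extract_links` are hypothetical names introduced for illustration.

```python
from bs4 import BeautifulSoup

# Small inline document so the example runs without network access.
SAMPLE_HTML = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <a href="/docs">Docs</a>
    <a>No target</a>
    <a href="https://example.com/blog"> Blog </a>
  </body>
</html>
"""


def page_title(html):
    """Return the text of the <title> tag, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.string if soup.title else None


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True skips anchors without an href attribute.
    return [(a.get_text(strip=True), a["href"])
            for a in soup.find_all("a", href=True)]
```

In practice the `html` argument would be the string downloaded with urllib rather than an inline sample.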