One year page review

with whiskynote.be data

Author

Tony Duan

Code

import requests
import os
from bs4 import BeautifulSoup
import pandas as pd
import time

Code

os.system('pip show beautifulsoup4')

Web scraping on www.whiskynotes.be

1 year page

Code

year_ur='https://www.whiskynotes.be/2023'

2 read in html

Code

# Send an HTTP GET request to the website
headers = {'User-Agent': 'My User Agent'}
response = requests.get(year_ur,headers=headers)

Code

# success code - 200 
print(response)

Code

#print(response.content)

Code

# Parse the HTML code using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')

3 review bottle name on one year

Code

bottle001=soup.find_all('p')

Code

for i in bottle001[1:5]:
  i.get_text()

4 review topic name on one year

Code

topic001=soup.select('.archive-link')

Code

for i in topic001[1:5]:
  i.get_text()

Code

topic_link=soup.select('.entry-permalink')

for link in topic_link[1:5]:
  link.get('href')

5 reference:

---
title: "One year page review"
subtitle: "with whiskynote.be data"
author: "Tony Duan"

execute:
  warning: false
  error: false
  eval: false
format:
  html:
    toc: true
    toc-location: right
    code-fold: show
    code-tools: true
    number-sections: true
    code-block-bg: true
    code-block-border-left: "#31BAE9"
---

```{python}
import requests
import os
from bs4 import BeautifulSoup
import pandas as pd
import time
```

```{python}
os.system('pip show beautifulsoup4')
```

Web scraping on www.whiskynotes.be

# year page

```{python}
year_ur='https://www.whiskynotes.be/2023'
```

![](images/clipboard-677064128.png)

# read in html


```{python}
# Send an HTTP GET request to the website
headers = {'User-Agent': 'My User Agent'}
response = requests.get(year_ur,headers=headers)
```


```{python}
# success code - 200 
print(response) 
```

```{python}
#print(response.content)
```


```{python}
# Parse the HTML code using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')
```


# review bottle name on one year

```{python}
bottle001=soup.find_all('p')
```

```{python}
for i in bottle001[1:5]:
  i.get_text()
```


# review topic name on one year

```{python}
topic001=soup.select('.archive-link')
```

```{python}
for i in topic001[1:5]:
  i.get_text()
```

```{python}
topic_link=soup.select('.entry-permalink')

for link in topic_link[1:5]:
  link.get('href')
  
```


# reference: