Copyright © 2019 Jiri Kriz, www.nosco.ch

BBC 6 Minute English

Links and tips for listening to and downloading BBC 6 Minute English.

Listen and download

The BBC publishes an excellent show, "6 Minute English", for English learners. You can find the broadcasts on:

Unfortunately, the links to the download pages (.mp3, .pdf) are not presented uniformly, so it is not easy to download the shows programmatically. This page lists the links to all "6 Minute English" programs in a uniform format. Click on a year to show the programs of that year; click again to hide them. You can use the source code of this page to write a program that downloads all shows.

A Python program is provided as an example of a bulk download.

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

Download using Python

# Call download("180101", "180331")
# to download the podcasts from 2018-01-01 to 2018-03-31.
# Dates are strings in YYMMDD format.

import re
import urllib.request


def download(date_start, date_end): 
    links = getLinks(date_start, date_end)
    for link in links:
        (date, name, pageUrl, mp3Url, pdfUrl) = link
        date_name = date + "_" + name
        print(date_name)
        downloadUrl(mp3Url, date_name + ".mp3")
        downloadUrl(pdfUrl, date_name + ".pdf")
        

def downloadUrl(url, downloadName):
    # Save the resource at url to the local file downloadName.
    urllib.request.urlretrieve(url, downloadName)


def getLinks(date_start, date_end):
    # read the content of this page and get URLs of the podcasts
    
    url = "http://www.nosco.ch/blog/de/2018/03/bbc-6min-english"
    with urllib.request.urlopen(url) as f:
        lines = f.readlines()
    
    oldest_date = "080312"
    links = []    
    for utfLine in lines:
        line = utfLine.decode("utf8")
        line = line.strip()
        if "'bbc_date" in line:
            searchObj = re.search(r"(\d+)", line)
            date = searchObj.group(1)
        elif "'bbc_page" in line:
            searchObj = re.search("href='(.*?)'", line)
            page = searchObj.group(1)           
        elif "'bbc_mp3" in line:
            searchObj = re.search("href='(.*?)'", line)
            mp3 = searchObj.group(1)            
        elif "'bbc_pdf" in line:
            searchObj = re.search("href='(.*?)'", line)
            pdf = searchObj.group(1)            
        elif "'bbc_name" in line:
            searchObj = re.search(">(.*?)<", line)
            name = searchObj.group(1)   
            if date_start <= date <= date_end:
                info = (date, name, page, mp3, pdf)
                links.append(info)  
            if date == oldest_date:
                break
    return links
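
The parsing step can also be tried offline, without fetching the page. The following is only a minimal sketch: the SAMPLE markup is hypothetical, inferred solely from the markers the program above searches for ('bbc_date, 'bbc_page, 'bbc_mp3, 'bbc_pdf, 'bbc_name), and the example.com URLs are placeholders. The function name parse_links is likewise an assumption and not part of the original program. Unlike getLinks, it skips entries with missing fields instead of raising an error.

```python
import re

# Hypothetical markup: the real page source may look different.
SAMPLE = """
<span class='bbc_date'>180104</span>
<a class='bbc_page' href='http://example.com/ep2'>page</a>
<a class='bbc_mp3' href='http://example.com/ep2.mp3'>mp3</a>
<a class='bbc_pdf' href='http://example.com/ep2.pdf'>pdf</a>
<span class='bbc_name'>Second episode</span>
<span class='bbc_date'>171228</span>
<a class='bbc_page' href='http://example.com/ep1'>page</a>
<a class='bbc_mp3' href='http://example.com/ep1.mp3'>mp3</a>
<a class='bbc_pdf' href='http://example.com/ep1.pdf'>pdf</a>
<span class='bbc_name'>First episode</span>
"""

HREF_FIELDS = ("page", "mp3", "pdf")
ALL_FIELDS = ("date", "name") + HREF_FIELDS


def parse_links(text, date_start, date_end):
    """Return (date, name, page, mp3, pdf) tuples with date in the range."""
    links = []
    entry = {}
    for line in text.splitlines():
        if "'bbc_date" in line:
            # A date line starts a new entry.
            match = re.search(r"\d+", line)
            entry = {"date": match.group(0)} if match else {}
        elif "'bbc_name" in line:
            # The name line closes the entry.
            match = re.search(r">(.*?)<", line)
            if match:
                entry["name"] = match.group(1)
            complete = all(field in entry for field in ALL_FIELDS)
            # YYMMDD strings have fixed width, so string comparison
            # orders them chronologically within the same century.
            if complete and date_start <= entry["date"] <= date_end:
                links.append(tuple(entry[field] for field in ALL_FIELDS))
        else:
            for field in HREF_FIELDS:
                if "'bbc_" + field in line:
                    match = re.search(r"href='(.*?)'", line)
                    if match:
                        entry[field] = match.group(1)
    return links


for link in parse_links(SAMPLE, "180101", "180331"):
    print(link)
```

With the sample above, only the entry dated 180104 falls into the requested range; the entry dated 171228 is filtered out.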
