read_csv error when reading the dataset directly from the website

The dataset can be directly accessed with the link (Yes, read_csv accepts links too!):

ParserError Traceback (most recent call last)
in ()
3 # greenhouse_data= pd.read_html(url)[1]
4 # greenhouse_data.head()
----> 5 greenhouse_data= pd.read_csv(url)

3 frames
/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in read(self, nrows)
2155 def read(self, nrows=None):
2156 try:
-> 2157 data = self._reader.read(nrows)
2158 except StopIteration:
2159 if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error()

ParserError: Error tokenizing data. C error: Expected 1 fields in line 7, saw 3
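
A likely explanation, assuming the URL pointed at a web page (HTML) rather than at the raw CSV file: read_csv sets the column count from the first lines it reads, then fails on the first later line with more commas than that. The sketch below reproduces the same kind of error offline. The HTML-like text is made up for illustration and is not the actual page content.

```python
import io

import pandas as pd

# Made-up HTML-like text standing in for a web page that was fetched
# instead of the raw CSV. Lines 1-6 contain no commas (1 field each);
# line 7 contains two commas (3 fields).
html_like = "\n".join([
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    "<meta charset='utf-8'>",
    "<title>page</title>",
    "</head>",
    "<body data-a='1', data-b='2', data-c='3'>",
])

try:
    pd.read_csv(io.StringIO(html_like))
    message = None
except pd.errors.ParserError as exc:
    # The parser cannot reconcile the field counts across lines.
    message = str(exc)

print(message)
```

If the message mentions "Expected 1 fields", it is worth opening the URL in a browser to check whether it serves plain CSV text or a rendered page.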

I think they have fixed the link. Please do check again to see if it works.

If it doesn’t work, you can download the file in a specific format (select CSV) and read it directly from your system.
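
A minimal sketch of the download-and-read route. So that it runs on its own, it first writes a tiny stand-in file; the file name, columns, and values are placeholders, not the real dataset.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the downloaded file: a tiny CSV with made-up columns and values.
sample = "area,value\nA,1\nB,2\n"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; use the path of your own download instead.
    local_path = os.path.join(tmp, "downloaded_dataset.csv")
    with open(local_path, "w", encoding="utf-8") as fh:
        fh.write(sample)

    # read_csv accepts a local file path just as it accepts a URL.
    data = pd.read_csv(local_path)

print(data.head())
```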


The link did not work for me either. I followed the link, then clicked on “RAW”, then used that link (https://raw.githubusercontent.com/dphi-official/Datasets/master/Standard_Metropolitan_Areas_Data-data.csv).

The raw link works fine.
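
For anyone who prefers not to click through, here is a rough sketch that rewrites a GitHub page URL into its raw form. The helper name `github_blob_to_raw` is made up for this post, and the `page_url` below is my assumption about what the original link looked like; only the raw URL is confirmed above.

```python
from urllib.parse import urlparse


def github_blob_to_raw(url: str) -> str:
    """Turn github.com/<user>/<repo>/blob/<branch>/<path> into its raw URL.

    Hypothetical helper for illustration; any URL that does not match
    that pattern is returned unchanged.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    # Expect: user, repo, "blob", branch, then at least one path segment.
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        return url
    user, repo, _blob, *rest = parts
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


# Assumed form of the original (non-raw) link.
page_url = (
    "https://github.com/dphi-official/Datasets/blob/master/"
    "Standard_Metropolitan_Areas_Data-data.csv"
)
raw_url = github_blob_to_raw(page_url)
print(raw_url)

# With network access, the result can be passed straight to pandas:
# import pandas as pd
# data = pd.read_csv(raw_url)
```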


@balaleo You can also use read_html from pandas to parse all the HTML tables on the page. It returns only 3 tables, so the required one can be picked out of that list by inspecting each entry.
