But these two lines give me the headers of my table's columns.
   itemlist = soup.find('table', id=True)
   #gives me the only table with an ID
   headers = itemlist.findAll('th')
   #gives me the headers of that table.

and to parse the table rows without recursing through the nested tables:
   rows = mytable.findAll('td', recursive=False)
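
For comparison, here is a rough standard-library sketch of the same goal: collect only the outer table's rows and cells while skipping anything inside nested tables. It is not BeautifulSoup; it uses Python's html.parser instead, and the class name TopLevelCells and the sample markup are made up for illustration. It assumes every table is closed with a </table> tag, which may not hold for badly broken pages.

```python
from html.parser import HTMLParser


class TopLevelCells(HTMLParser):
    """Collect cell text from the outermost table only (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # current <table> nesting depth
        self.rows = []        # one list of cell strings per outer row
        self.in_cell = False  # True while inside an outer <td>/<th>

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
        elif self.depth == 1 and tag == "tr":
            self.rows.append([])
            self.in_cell = False
        elif self.depth == 1 and tag in ("td", "th"):
            if not self.rows:  # cell without a preceding <tr>
                self.rows.append([])
            self.rows[-1].append("")
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "table":
            self.depth -= 1
        elif self.depth == 1 and tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        # Text inside nested tables arrives at depth > 1 and is ignored.
        if self.depth == 1 and self.in_cell:
            self.rows[-1][-1] += data.strip()


SAMPLE = (
    '<table id="items">'
    "<tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>outer<table><tr><td>inner</td></tr></table></td><td>2</td></tr>"
    "</table>"
)

parser = TopLevelCells()
parser.feed(SAMPLE)
print(parser.rows)  # [['Name', 'Qty'], ['outer', '2']]
```

The depth counter does the job that recursive=False does above: only tags seen at nesting depth 1 count as belonging to the table of interest.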

In the HTML CSS parser - you want to look at tagsNamed:

for instance - dom tagsNamed: 'table'

will return a collection of table nodes that are children of the receiver.

Look at the implementation of that in HtmlDOMNode - it uses a method called nodesCollect:

that takes an arbitrary block and returns all subnodes for which the block evaluates to true. It is very similar.
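
To make the idea concrete for someone coming from Python, here is a minimal sketch of a predicate-driven tree walk. It is only an analogue, not the actual HtmlDOMNode code: the Node class and the names nodes_collect and tags_named are hypothetical stand-ins, and whether the real method also tests the receiver itself is an assumption (this sketch looks at descendants only).

```python
class Node:
    """Hypothetical minimal DOM node: a tag name plus child nodes."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []

    def nodes_collect(self, predicate):
        """Return every descendant for which predicate(node) is true,
        in document order."""
        found = []
        for child in self.children:
            if predicate(child):
                found.append(child)
            # Keep descending so nested matches are found as well.
            found.extend(child.nodes_collect(predicate))
        return found

    def tags_named(self, name):
        """Convenience wrapper: the predicate just compares tag names."""
        return self.nodes_collect(lambda node: node.tag == name)


# A table nested inside another table's cell, plus an unrelated paragraph.
dom = Node("html", [
    Node("body", [
        Node("table", [Node("tr", [Node("td", [Node("table")])])]),
        Node("p"),
    ]),
])

tables = dom.tags_named("table")
print(len(tables))  # 2
```

Because the predicate is arbitrary, the same walk can select on attributes or nesting rather than just the tag name.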

The HTML is broken and has hundreds of tables. There are something like
6 nested tables in each of the primary table's rows. This is from an MS
SharePoint website. The markup is awful.

HtmlCSSParser was designed to deal with just such markup (and tries to explain what is wrong with it).

I'm sure there is an easy way in Squeak to do the above, but I haven't
spent enough time to master it.

A problem I've had with both of the above, and which makes them hard
for me to use, is that they have both popped up modal dialogs which I
had to click on in order to proceed.

They have fairly different APIs.

The HTML-Parser popped up a box for every tag without a closing tag.
The Html+CSS Validator seemed to pop up a box when it couldn't connect
to a site. I guess it was attempting to retrieve the CSS in order to
validate?

That would be the underlying transport layer - HtmlCSSParser never tries to interact with the user.

You don't have to validate.

dom := (HtmlValidator onUrl: 'http://something.com') dom.

Cheers,

-Todd