If Google cannot see the content, it assumes the content doesn't exist; by disallowing those URLs in your robots.txt, you're telling Google not to crawl them, so as far as Google is concerned they aren't there.
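For example, a robots.txt rule like this (the path is just illustrative) tells Google's crawler to skip everything under /forums/, so none of those pages can be indexed:

    User-agent: Googlebot
    Disallow: /forums/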
As for a limit on how much they'll index, again I don't believe that's the case, especially since they support multiple sitemaps and sitemap index files, with a 50,000-URL limit per sitemap file.
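As a rough illustration (the file names are made up), a sitemap index simply points Google at each individual sitemap, so a board with, say, 120,000 threads could use three sitemaps of up to 50,000 URLs each:

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap><loc>https://example.com/sitemap1.xml</loc></sitemap>
      <sitemap><loc>https://example.com/sitemap2.xml</loc></sitemap>
      <sitemap><loc>https://example.com/sitemap3.xml</loc></sitemap>
    </sitemapindex>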
As for sitemaps: a sitemap generator like the one I sell for the UBB should populate the links from your threads automatically, including generating a sitemap index once you go over that limit. A bare-bones sketch of the idea follows.
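To give an idea of what such a generator does (this is a minimal sketch, not my actual product; the base URL and file names are hypothetical), it chunks the thread URLs into files of at most 50,000 and writes an index pointing at each one:

    # Minimal sketch of a sitemap generator: split URLs across sitemap
    # files of up to 50,000 entries, then write a sitemap index.
    from datetime import date

    URLS_PER_SITEMAP = 50_000  # Google's per-file URL limit

    def write_sitemaps(urls, base="https://example.com"):
        # Break the full URL list into 50,000-URL chunks
        chunks = [urls[i:i + URLS_PER_SITEMAP]
                  for i in range(0, len(urls), URLS_PER_SITEMAP)]
        names = []
        for n, chunk in enumerate(chunks, 1):
            name = f"sitemap{n}.xml"
            names.append(name)
            with open(name, "w") as f:
                f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
                f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
                for u in chunk:
                    f.write(f"  <url><loc>{u}</loc></url>\n")
                f.write("</urlset>\n")
        # The index file points the crawler at each individual sitemap
        with open("sitemap_index.xml", "w") as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
            f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            for name in names:
                f.write(f"  <sitemap><loc>{base}/{name}</loc>"
                        f"<lastmod>{date.today()}</lastmod></sitemap>\n")
            f.write("</sitemapindex>\n")

You'd then submit only sitemap_index.xml to Google, and it discovers the rest from there.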