For some reason I had to delete some pages, these pages are using the HTML suffix, so I blocked them in robots.txt use Disallow: /*.html, but it’s been almost a year, I found that google robot often capture these pages, How can I quickly let Google completely remove these pages? And I have removed these URL from google webmaster tool by google index-> remove URLs, but Google still capture these pages.
Another example when the “nofollow" attribute can come handy are widget links. If you are using a third party's widget to enrich the experience of your site and engage users, check if it contains any links that you did not intend to place on your site along with the widget. Some widgets may add links to your site which are not your editorial choice and contain anchor text that you as a webmaster may not control. If removing such unwanted links from the widget is not possible, you can always disable them with “nofollow" attribute. If you create a widget for functionality or content that you provide, make sure to include the nofollow on links in the default code snippet.
In February 2011, Google announced the Panda update, which penalizes websites containing content duplicated from other websites and sources. Historically websites have copied content from one another and benefited in search engine rankings by engaging in this practice. However, Google implemented a new system which punishes sites whose content is not unique. The 2012 Google Penguin attempted to penalize websites that used manipulative techniques to improve their rankings on the search engine. Although Google Penguin has been presented as an algorithm aimed at fighting web spam, it really focuses on spammy links by gauging the quality of the sites the links are coming from. The 2013 Google Hummingbird update featured an algorithm change designed to improve Google's natural language processing and semantic understanding of web pages. Hummingbird's language processing system falls under the newly recognized term of 'Conversational Search' where the system pays more attention to each word in the query in order to better match the pages to the meaning of the query rather than a few words . With regards to the changes made to search engine optimization, for content publishers and writers, Hummingbird is intended to resolve issues by getting rid of irrelevant content and spam, allowing Google to produce high-quality content and rely on them to be 'trusted' authors.
Lastly, it's important to remember that paralysis by over-thinking is a real issue some struggle with. There's no pill for it (yet). Predicting perfection is a fool's errand. Get as close as you can within a reasonable timeframe, and prepare for future iteration. If you're traveling through your plan and determine a soft spot at any time, simply pivot. It's many hours of upfront work to get your strategy built, but it's not too hard to tweak as you go.