How to get Google Indexed Pages Count, and/or do Google searches through AJAX/programmatically

27 / Sep / 2011 by Kushal Likhi 8 comments

Hi,
Recently i had to get the count of the Pages Google has Indexed for a perticular web Site progmatically.
In the process i found that Google offers an AJAX Search API to perform Google Searches.
.
Hence i Implemented a Class which does the same for me easily(Ajax Search in Google).
And this Solution Does Google Searches programmatically and Also Give us the Indexed Pages Count.
Note: Its Groovy Implementation, Similar can be done in JavaScript using Ajax though intent will remain the same.
The class has been named GoogleAjaxSearch because it uses the Google AJAX Search API ;)

Use Cases Are As Follows:

//Case 1 Simple Search
String searchQuery = "my search Query"
GoogleAjaxSearch googleAjaxSearch = new GoogleAjaxSearch(searchQuery)
println googleAjaxSearch.results //Type List<Expando>, Just print it to see available properties

//Case 2: Get Number Of Indexed Pages in Google
String searchQuery = "site:intelligrape.com"
GoogleAjaxSearch googleAjaxSearch = new GoogleAjaxSearch(searchQuery)
println googleAjaxSearch.estimatedResultCount

//Case 3: Redo a search with same SearchQuery or a different Search Query
String searchQuery = "search string"
GoogleAjaxSearch googleAjaxSearch = new GoogleAjaxSearch(searchQuery)
googleAjaxSearch.redo() //Re-Search with same query
googleAjaxSearch.redo("New Query") // Re-Search With New Query

//Case 4: All Details Available
String searchQuery = "site:intelligrape.com"
GoogleAjaxSearch googleAjaxSearch = new GoogleAjaxSearch(searchQuery)
println googleAjaxSearch.estimatedResultCount //Gives the result count, indexed pages count
println googleAjaxSearch.currentPageIndex    //Search result server side pagination - current page
println googleAjaxSearch.results  //List<Expando> containing results
println googleAjaxSearch.responseStatus //HTTP Status Code
println googleAjaxSearch.responseDetails //Details for response, if any
println googleAjaxSearch.searchQuery     //Search Query 
println googleAjaxSearch.moreResultsUrl // Url to hit for next Page of results
println googleAjaxSearch.jsonResult // Complete raw JSON Result from google
println googleAjaxSearch.pages  // Pagination Pages Details

The Class Code Is As Follows:


package myPackage.googleSearch

import grails.converters.JSON

class GoogleAjaxSearch {

    public String searchQuery
    public String responseDetails
    public String moreResultsUrl
    public String jsonResult

    public Integer responseStatus
    public Integer currentPageIndex
    public Integer estimatedResultCount

    public def pages

    public List<Expando> results = []

    private static final String ajaxSearchTarget = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=###SEARCHQUERY###&filter=0"

    public GoogleAjaxSearch(String searchQuery) {
        this.searchQuery = searchQuery
        search()
    }

    public GoogleAjaxSearch() {
        this(null)
    }

    public void redo() {
        search()
    }

    public void redo(String newSearchQuery) {
        this.searchQuery = newSearchQuery
        search()
    }

    private void search() {
        if (searchQuery) {
            URL url = new URL(ajaxSearchTarget.replace('###SEARCHQUERY###', searchQuery))
            URLConnection connection = url.openConnection()
            connection.setDoInput(true)
            InputStream inStream = connection.getInputStream()
            BufferedReader searchResultContent = new BufferedReader(new InputStreamReader(inStream))
            jsonResult = searchResultContent.getText()
            parseJsonAndPopulateObject()
        }
    }

    private void parseJsonAndPopulateObject() {
        def jsonArray = JSON.parse(jsonResult)
        this.responseStatus = Integer.parseInt(jsonArray.responseStatus as String, 10)
        this.responseDetails = jsonArray.responseDetails
        this.moreResultsUrl = jsonArray.responseData.cursor.moreResultsUrl
        this.currentPageIndex = Integer.parseInt(jsonArray.responseData.cursor.currentPageIndex as String, 10)
        this.pages = jsonArray.responseData.cursor.pages
        this.estimatedResultCount = Integer.parseInt(jsonArray.responseData.cursor.estimatedResultCount as String, 10)
        results = jsonArray.responseData.results.collect {
            new Expando(
                    content: it.content,
                    GsearchResultClass: it.GsearchResultClass,
                    titleNoFormatting: it.titleNoFormatting,
                    title: it.title,
                    cacheUrl: it.cacheUrl,
                    unescapedUrl: it.unescapedUrl,
                    url: it.url,
                    visibleUrl: it.visibleUrl
            )
        }
    }
}

Hope That Helps :)
Regards
Kushal Likhi

Tag -

FOUND THIS USEFUL? SHARE IT

comments (8)

  1. deck hearthstone cheat

    It’s going to be ending of mine day, however before
    finish I am reading this enormous paragraph to improve my experience.

    Reply
  2. Ilene

    Hey, I think your site might be having browser compatibility issues.
    When I look at your blog site in Ie, it looks fine but when olening in Internet Explorer,
    it has some overlapping. I just wanted to gife you a quick heads up!
    Other thdn that, amazing blog!

    Reply
  3. best website promotion

    Wow, that’s what I was looking for, what a material! present here at this weblog, thanks admin of this website.

    Reply
  4. BEST TUMBLR BLOG

    Does your site have a contact page? I’m having a tough time locating it but, I’d like to shoot you an email. I’ve got some recommendations for your blog you might be interested in hearing. Either way, great blog and I look forward to seeing it improve over time.

    Reply
  5. Kushal

    Hi,
    The Beauty of groovy is that it is java, but java is not groovy, The class above will work, you have to only remove some groovy enhancements:
    1) Change Expando Class to HashMap Class
    2) change the implementation of parseJsonAndPopulateObject() method to support some Java available JSON Parser.
    3) Some imports on the top
    4) Add semi-colan at end of each line :)
    Rest all should be the same….
    .
    OR
    You can compile the above code using groovy compiler and generate a JAR, add that jar in your project.. :) , and yes then you need to also add the groovy-JAR in the project.
    .
    Then you will be able to access it like a standard JAVA class..
    .
    If you want i can generate a jar with packed groovy dependencies, then you need to add that jar in your project and all good… P.S that jar will be around 5MB :)

    Reply
  6. Nass

    Hi,
    it’s very ineresting.
    I’m searching about Google Indexed Pages Count and search count programming in java.
    Any helps?
    Thanks a lot

    Reply

Leave a comment -