Classify URL

POST classify/url

Classify the submitted URL and return scored categories and keywords.

The Classify URL call honors the robots.txt file of the site hosting the specified URL. If robots.txt blocks the specified URL, the call returns a 403 error with a message stating that the URL is blocked.
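To avoid a 403 response, a client could check robots.txt before submitting a URL. The following is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed_by_robots, the sample rules, and the "*" user agent are assumptions made for illustration; this page does not state which user agent eContext uses when it reads robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url.

    user_agent defaults to "*" because the crawler's real user agent
    is not documented here (assumption).
    """
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content used only for this example.
sample_robots = """\
User-agent: *
Disallow: /private/
"""

blocked = is_allowed_by_robots(sample_robots, "http://example.com/private/page.html")
allowed = is_allowed_by_robots(sample_robots, "http://example.com/public/page.html")
print(blocked, allowed)
```

In practice the robots.txt text would be downloaded from the target site first; the sketch takes it as a string so that it runs without network access.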

The Classify URL call also requires the URL being examined to meet content-type and content-length limits. The Content-Type header for the URL must be one of the following: text/plain, text/html, text/xhtml, application/xhtml+xml, text/xml, application/xml. The Content-Length header must present a value less than or equal to 256000 bytes.
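The header requirements above can be expressed as a small client-side check. This is a sketch only: the function name meets_content_requirements is hypothetical, and two behaviors are assumptions rather than documented facts, namely that parameters such as "; charset=UTF-8" are ignored when comparing the content type and that a missing Content-Length header counts as a failure.

```python
ALLOWED_CONTENT_TYPES = {
    "text/plain",
    "text/html",
    "text/xhtml",
    "application/xhtml+xml",
    "text/xml",
    "application/xml",
}
MAX_CONTENT_LENGTH = 256000  # bytes, from the limit stated above


def meets_content_requirements(content_type, content_length):
    """Check Content-Type and Content-Length header values against the limits."""
    # Assumption: drop parameters such as "; charset=UTF-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ALLOWED_CONTENT_TYPES:
        return False
    # Assumption: a missing or non-numeric Content-Length fails the check.
    if content_length is None:
        return False
    try:
        length = int(content_length)
    except ValueError:
        return False
    return 0 <= length <= MAX_CONTENT_LENGTH


print(meets_content_requirements("text/html; charset=UTF-8", "5663"))
print(meets_content_requirements("application/pdf", "100"))
```

The header values would typically come from a HEAD request to the target URL made before calling the API.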

Parameters

Parameter Type Description
url (required) string A fully qualified URL to be retrieved and classified.
classification_type (optional) integer Select the classification method: 1 for rule-based, 2 for model-based, or 0 for a hybrid rule-based + model-based (defaults to 0)
ml_threshold (optional) float Specify a confidence threshold for accepting an ML prediction. A lower value increases recall at the expense of precision (defaults to 0.75)
cache_skip (optional) boolean Rescrape a URL for HTML content rather than using a possibly cached scrape (defaults to false)
entities (optional) boolean Perform Named Entity Recognition (NER) on the content submitted (defaults to false)
sentiment (optional) boolean Perform sentiment analysis on the content submitted (defaults to false)
min_tags (optional) integer eContext uses a smart parsing library to extract only the most relevant content from a webpage and to ignore areas likely to be less relevant (navigation, footers, etc.). For some pages, however, this may extract less content than expected. Use this parameter to set a minimum number of HTML tags the smart library must extract; if the result is less than this minimum, eContext will extract content from all HTML tags (that is, a full-page parse).
taxonomy_timestamp (optional) integer A Unix timestamp instructing the classifier to use categories from the eContext Taxonomy as it existed at that point in time. Categories deleted since then remain available, and categories created since then are hidden.
dataset_id (optional) string A Custom Taxonomies id to use in lieu of the default eContext Taxonomy
add_last_node (optional) boolean Include the last category node, or leave the result at the parent category
classify_limit (optional) integer Limit the number of categories that may be returned per post
classify_timeout (optional) float The number of seconds to spend on a classification task
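As an illustration of how these parameters fit together, the sketch below assembles a request body and rejects names that are not in the table. The helper build_payload is hypothetical and not part of any eContext client library. The async field shown in the example request further down is not listed in the table, so the sketch leaves it out.

```python
import json

# Optional parameter names taken from the table above.
OPTIONAL_PARAMETERS = {
    "classification_type", "ml_threshold", "cache_skip", "entities",
    "sentiment", "min_tags", "taxonomy_timestamp", "dataset_id",
    "add_last_node", "classify_limit", "classify_timeout",
}


def build_payload(url, **options):
    """Build a Classify URL request body from the documented parameters."""
    if not url:
        raise ValueError("url is required")
    unknown = set(options) - OPTIONAL_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # 1 = rule-based, 2 = model-based, 0 = hybrid (the documented default).
    if options.get("classification_type", 0) not in (0, 1, 2):
        raise ValueError("classification_type must be 0, 1, or 2")
    return {"url": url, **options}


payload = build_payload(
    "http://topics.info.com/Parks_4679",
    classification_type=2,
    ml_threshold=0.6,
    entities=True,
)
print(json.dumps(payload, indent=4))
```

Parameters that are omitted are simply not sent, which leaves the service to apply the defaults listed in the table.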

Return

The result set includes scored_categories and scored_keywords as well as a categories dictionary. The scored_keywords object lists the high-value phrases that eContext was able to extract from the retrieved content, with a score for each. The scored_categories object lists category_id and score pairs, where each category_id corresponds to an entry in the categories dictionary. A higher score indicates a stronger match.
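The relationship between scored_categories and the categories dictionary can be sketched as a simple lookup. The function top_categories is a hypothetical helper, and sample_result is a trimmed copy of the example response on this page, reduced to the fields the lookup needs.

```python
def top_categories(result, limit=3):
    """Return (category name, score) pairs, highest score first."""
    classify = result["econtext"]["classify"]
    categories = classify["categories"]
    ranked = sorted(
        classify["scored_categories"],
        key=lambda item: item["score"],
        reverse=True,
    )
    # Each category_id is a key into the categories dictionary.
    return [
        (categories[item["category_id"]]["name"], item["score"])
        for item in ranked[:limit]
    ]


# Trimmed sample based on the example response in this document.
sample_result = {
    "econtext": {
        "classify": {
            "scored_categories": [
                {"category_id": "1a9587f016d90b4cfb0c473039c98f3a",
                 "score": 0.065779169929522},
                {"category_id": "6981e993569ba5af3cc14d7c3a05fc76",
                 "score": 0.66405638214565},
            ],
            "categories": {
                "6981e993569ba5af3cc14d7c3a05fc76": {"name": "eContext"},
                "1a9587f016d90b4cfb0c473039c98f3a": {
                    "name": "Video & Live Media Streaming"
                },
            },
        }
    }
}

for name, score in top_categories(sample_result):
    print(f"{score:.4f}  {name}")
```

The same lookup works for the path and idpath fields if the full taxonomy path is needed instead of the category name.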

Example Request

POST Request

curl -X POST -u username:password --data-binary @classify-url-input.json \
--header "Content-type: application/json" \
https://api.econtext.com/v2/classify/url

The contents of classify-url-input.json:

{
    "async": false,
    "url":"http://topics.info.com/Parks_4679"
}
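
For readers who prefer Python to curl, a roughly equivalent request can be prepared with the standard library. This is a sketch that assumes curl's -u option corresponds to HTTP Basic authentication; the function build_classify_request is hypothetical, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request

ENDPOINT = "https://api.econtext.com/v2/classify/url"


def build_classify_request(username, password, target_url):
    """Prepare (but do not send) a POST request like the curl example."""
    body = json.dumps({"async": False, "url": target_url}).encode("utf-8")
    # curl -u username:password sends an HTTP Basic Authorization header.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_classify_request(
    "username", "password", "http://topics.info.com/Parks_4679"
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it; the response body could then be decoded with json.loads.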

POST Response

{
    "econtext": {
        "classify": {
            "title": "Semantic Text Classification | eContext Taxonomy and Data Structure |",
            "scored_categories": [
                {
                    "category_id": "6981e993569ba5af3cc14d7c3a05fc76",
                    "score": 0.66405638214565
                },
                {
                    "category_id": "1a9587f016d90b4cfb0c473039c98f3a",
                    "score": 0.065779169929522
                },
                {
                    "category_id": "b2ba2d3dc01ec7dea4263ebd882f9e86",
                    "score": 0.043852779953015
                },
                {
                    "category_id": "c29f157a195759a39cc27e7c540cd4d9",
                    "score": 0.043852779953015
                },
                {
                    "category_id": "9bfd0bb46baa2306a253f745cec5b1f7",
                    "score": 0.037588097102584
                },
                {
                    "category_id": "71a84c532ab6124052e5b92f85dc5dd8",
                    "score": 0.034455755677369
                },
                {
                    "category_id": "34d5bf8766845bf437fe3c69663692f2",
                    "score": 0.027407987470634
                },
                {
                    "category_id": "fc2b4234fdf6b60d01daa7e34d8b5bae",
                    "score": 0.02662490211433
                },
                {
                    "category_id": "6c7d1b5fbb00b32a1132007c9c851995",
                    "score": 0.025058731401723
                }
            ],
            "scored_keywords": [
                {
                    "keyword": "econtext",
                    "score": 0.63082437275986
                },
                {
                    "keyword": "video",
                    "score": 0.075268817204301
                },
                {
                    "keyword": "chatbots",
                    "score": 0.050179211469534
                },
                {
                    "keyword": "surveys",
                    "score": 0.043010752688172
                },
                {
                    "keyword": "keywords",
                    "score": 0.039426523297491
                },
                {
                    "keyword": "econtext's",
                    "score": 0.039426523297491
                },
                {
                    "keyword": "kantar",
                    "score": 0.028673835125448
                },
                {
                    "keyword": "scientist",
                    "score": 0.028673835125448
                },
                {
                    "keyword": "taxonomy",
                    "score": 0.028673835125448
                }
            ],
            "categories": {
                "6981e993569ba5af3cc14d7c3a05fc76": {
                    "id": "6981e993569ba5af3cc14d7c3a05fc76",
                    "name": "eContext",
                    "path": [
                        "Business & Industrial",
                        "Advertising & Marketing",
                        "Advertising & Marketing Services",
                        "Internet Advertising & Marketing",
                        "Internet Advertising & Marketing Tools",
                        "eContext"
                    ],
                    "idpath": [
                        "93ae18acd5845912d0719cf14e34fff0",
                        "4a90604ecbb8e54663f84f59ce4350c1",
                        "ab0630bafba120dedbb788c0b8d33091",
                        "56ee67cc2683e5c1e5dcf113f835fddb",
                        "e6229d3e428212d041b432f89399871a",
                        "6981e993569ba5af3cc14d7c3a05fc76"
                    ],
                    "stats": {
                        "social_relevance": 6.42e-8,
                        "social_idf": 16.3422325487
                    },
                    "facets": [
                        [
                            "domain",
                            "service"
                        ]
                    ]
                },
                "1a9587f016d90b4cfb0c473039c98f3a": {
                    "id": "1a9587f016d90b4cfb0c473039c98f3a",
                    "name": "Video & Live Media Streaming",
                    "path": [
                        "Computers & Electronics",
                        "Telecommunications",
                        "Internet",
                        "Websites & Digital Content",
                        "File Hosting & Sharing",
                        "Video & Live Media Streaming"
                    ],
                    "idpath": [
                        "bdc03d860e5f33c08146faa43487c1bd",
                        "2712a67ea6c5398779d806a7a5f016eb",
                        "bbd7a35fae11c6cde461e75bd99e1b1a",
                        "78971b721e12d951c071b2e3d01c74e8",
                        "25bd7afbfe29570a835b986b68518d79",
                        "1a9587f016d90b4cfb0c473039c98f3a"
                    ],
                    "stats": {
                        "social_relevance": 0.003922828,
                        "social_idf": 5.3221781922
                    },
                    "facets": [
                        [
                            "domain",
                            "facility"
                        ],
                        [
                            "domain",
                            "service"
                        ]
                    ]
                },
                "b2ba2d3dc01ec7dea4263ebd882f9e86": {
                    "id": "b2ba2d3dc01ec7dea4263ebd882f9e86",
                    "name": "Chatbots & Conversational Platforms",
                    "path": [
                        "Computers & Electronics",
                        "Computers",
                        "Computer Products",
                        "Computer Software",
                        "Apps",
                        "Application Software",
                        "Communications Software",
                        "Messaging Software",
                        "Chatbots & Conversational Platforms"
                    ],
                    "idpath": [
                        "bdc03d860e5f33c08146faa43487c1bd",
                        "ed62d0b6672e5addd702fd780ccd185d",
                        "40cf7c7334801a84c1c52166595e3d7e",
                        "3f1ff940a8bdeb0c9804a879f88f598e",
                        "e4782eb0f978ded90481e2b177ead9c4",
                        "3295dd2a46d67ca4c723c481dac6ed5f",
                        "2f15d4c34b296dee69c2fab67cfe11e6",
                        "d039be7ef56e1a2d07e6dfb7509052ba",
                        "b2ba2d3dc01ec7dea4263ebd882f9e86"
                    ],
                    "stats": {
                        "social_relevance": 1.29076e-5,
                        "social_idf": 11.0389276407
                    },
                    "facets": []
                },
                "71a84c532ab6124052e5b92f85dc5dd8": {
                    "id": "71a84c532ab6124052e5b92f85dc5dd8",
                    "name": "Keywords",
                    "path": [
                        "Business & Industrial",
                        "Advertising & Marketing",
                        "Advertising & Marketing Services",
                        "Internet Advertising & Marketing",
                        "Internet Advertising & Marketing [No Strategy Specified]",
                        "Keywords"
                    ],
                    "idpath": [
                        "93ae18acd5845912d0719cf14e34fff0",
                        "4a90604ecbb8e54663f84f59ce4350c1",
                        "ab0630bafba120dedbb788c0b8d33091",
                        "56ee67cc2683e5c1e5dcf113f835fddb",
                        "c4b046136fcb9ae7bc9df7a1b4f6afe3",
                        "71a84c532ab6124052e5b92f85dc5dd8"
                    ],
                    "stats": {
                        "social_relevance": 2.89619e-5,
                        "social_idf": 10.2307652092
                    },
                    "facets": []
                },
                "c29f157a195759a39cc27e7c540cd4d9": {
                    "id": "c29f157a195759a39cc27e7c540cd4d9",
                    "name": "Scientists",
                    "path": [
                        "Sciences & Humanities",
                        "Science",
                        "Science [No Branch Specified]",
                        "Scientists"
                    ],
                    "idpath": [
                        "9c15c34150b7e723fea0eb4b12878947",
                        "9954bdf75b1d9c9abde66f5fa8d8754f",
                        "3b54651274fceca46f708592533817b4",
                        "c29f157a195759a39cc27e7c540cd4d9"
                    ],
                    "stats": {
                        "social_relevance": 0.0001907889,
                        "social_idf": 8.3455786733
                    },
                    "facets": []
                },
                "fc2b4234fdf6b60d01daa7e34d8b5bae": {
                    "id": "fc2b4234fdf6b60d01daa7e34d8b5bae",
                    "name": "Publicis Groupe",
                    "path": [
                        "Business & Industrial",
                        "Advertising & Marketing",
                        "Advertising & Marketing Services",
                        "Advertising & Marketing Services [No Media Type Specified]",
                        "Advertising & Marketing Services [No Industry or Demographic Specified]",
                        "Advertising & Marketing Agencies",
                        "Publicis Groupe"
                    ],
                    "idpath": [
                        "93ae18acd5845912d0719cf14e34fff0",
                        "4a90604ecbb8e54663f84f59ce4350c1",
                        "ab0630bafba120dedbb788c0b8d33091",
                        "014c8b620691495410e338d79143c579",
                        "8bc5d713c6c3637d379d216d56e36a6e",
                        "12bcf2eeb20bbcb300f78e06378d5df9",
                        "fc2b4234fdf6b60d01daa7e34d8b5bae"
                    ],
                    "stats": {
                        "social_relevance": 1.2201e-6,
                        "social_idf": 13.3977935696
                    },
                    "facets": []
                },
                "6c7d1b5fbb00b32a1132007c9c851995": {
                    "id": "6c7d1b5fbb00b32a1132007c9c851995",
                    "name": "Kantar",
                    "path": [
                        "Business & Industrial",
                        "Advertising & Marketing",
                        "Advertising & Marketing Services",
                        "Advertising & Marketing Services [No Media Type Specified]",
                        "Advertising & Marketing Services [No Industry or Demographic Specified]",
                        "Advertising & Marketing Agencies",
                        "WPP",
                        "Kantar"
                    ],
                    "idpath": [
                        "93ae18acd5845912d0719cf14e34fff0",
                        "4a90604ecbb8e54663f84f59ce4350c1",
                        "ab0630bafba120dedbb788c0b8d33091",
                        "014c8b620691495410e338d79143c579",
                        "8bc5d713c6c3637d379d216d56e36a6e",
                        "12bcf2eeb20bbcb300f78e06378d5df9",
                        "8b3342227d08018b0171b0c661b9f996",
                        "6c7d1b5fbb00b32a1132007c9c851995"
                    ],
                    "stats": {
                        "social_relevance": 5.78e-7,
                        "social_idf": 14.1450079714
                    },
                    "facets": []
                },
                "34d5bf8766845bf437fe3c69663692f2": {
                    "id": "34d5bf8766845bf437fe3c69663692f2",
                    "name": "Customer Service",
                    "path": [
                        "Business & Industrial",
                        "General Business & Industrial",
                        "General Business & Industrial Services",
                        "General Business Services",
                        "Business Operations, Management, & Support Services",
                        "Business Operations & Management",
                        "Customer Relations",
                        "Customer Service"
                    ],
                    "idpath": [
                        "93ae18acd5845912d0719cf14e34fff0",
                        "85223b2c100418dea4b61c33ca47f862",
                        "63ad16a5babfd448801877882fee0516",
                        "60876790b7febe9ceaa6cc623cad9c20",
                        "71a8e104f880da61fb5cc3db0e10ec3c",
                        "a42adf2a895dd4d471b9391e072fc687",
                        "8599dc1f0e4ecf0e18102c9d18f42363",
                        "34d5bf8766845bf437fe3c69663692f2"
                    ],
                    "stats": {
                        "social_relevance": 0.0002349703,
                        "social_idf": 8.1372873837
                    },
                    "facets": []
                },
                "9bfd0bb46baa2306a253f745cec5b1f7": {
                    "id": "9bfd0bb46baa2306a253f745cec5b1f7",
                    "name": "Surveys",
                    "path": [
                        "Sciences & Humanities",
                        "Science",
                        "Social Sciences",
                        "Sociology",
                        "Sociological Research Methods",
                        "Surveys"
                    ],
                    "idpath": [
                        "9c15c34150b7e723fea0eb4b12878947",
                        "9954bdf75b1d9c9abde66f5fa8d8754f",
                        "c270f6632e37fe26329d9af4a515122c",
                        "f3df44a5265aff3cfda548122add9271",
                        "58a43bc17febd6642724d8acafec7275",
                        "9bfd0bb46baa2306a253f745cec5b1f7"
                    ],
                    "stats": {
                        "social_relevance": 0.0001555337,
                        "social_idf": 8.5498836246
                    },
                    "facets": []
                }
            },
            "entities": [],
            "sentiment": 0.61099622641509,
            "chars": 5663,
            "overlay": {
                "6981e993569ba5af3cc14d7c3a05fc76": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "52",
                                "Business and Finance"
                            ],
                            [
                                "53",
                                "Business and Finance::Business"
                            ]
                        ],
                        [
                            [
                                "90",
                                "Business and Finance::Industries"
                            ],
                            [
                                "91",
                                "Business and Finance::Industries::Advertising Industry"
                            ],
                            [
                                "58",
                                "Business and Finance::Business::Marketing and Advertising"
                            ]
                        ],
                        [
                            [
                                "602",
                                "Technology & Computing::Computing::Computer Software and Applications"
                            ]
                        ]
                    ]
                },
                "1a9587f016d90b4cfb0c473039c98f3a": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "632",
                                "Technology & Computing::Consumer Electronics"
                            ],
                            [
                                "596",
                                "Technology & Computing"
                            ]
                        ],
                        [
                            [
                                "116",
                                "Business and Finance::Industries::Telecommunications Industry"
                            ]
                        ],
                        [
                            [
                                "619",
                                "Technology & Computing::Computing::Internet"
                            ]
                        ]
                    ]
                },
                "b2ba2d3dc01ec7dea4263ebd882f9e86": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "632",
                                "Technology & Computing::Consumer Electronics"
                            ],
                            [
                                "596",
                                "Technology & Computing"
                            ]
                        ],
                        [
                            [
                                "599",
                                "Technology & Computing::Computing"
                            ]
                        ],
                        [
                            [
                                "602",
                                "Technology & Computing::Computing::Computer Software and Applications"
                            ]
                        ]
                    ]
                },
                "71a84c532ab6124052e5b92f85dc5dd8": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "52",
                                "Business and Finance"
                            ],
                            [
                                "53",
                                "Business and Finance::Business"
                            ]
                        ],
                        [
                            [
                                "90",
                                "Business and Finance::Industries"
                            ],
                            [
                                "91",
                                "Business and Finance::Industries::Advertising Industry"
                            ],
                            [
                                "58",
                                "Business and Finance::Business::Marketing and Advertising"
                            ]
                        ]
                    ]
                },
                "c29f157a195759a39cc27e7c540cd4d9": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "464",
                                "Science"
                            ]
                        ]
                    ]
                },
                "fc2b4234fdf6b60d01daa7e34d8b5bae": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "52",
                                "Business and Finance"
                            ],
                            [
                                "53",
                                "Business and Finance::Business"
                            ]
                        ],
                        [
                            [
                                "90",
                                "Business and Finance::Industries"
                            ],
                            [
                                "91",
                                "Business and Finance::Industries::Advertising Industry"
                            ],
                            [
                                "58",
                                "Business and Finance::Business::Marketing and Advertising"
                            ]
                        ]
                    ]
                },
                "6c7d1b5fbb00b32a1132007c9c851995": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "52",
                                "Business and Finance"
                            ],
                            [
                                "53",
                                "Business and Finance::Business"
                            ]
                        ],
                        [
                            [
                                "90",
                                "Business and Finance::Industries"
                            ],
                            [
                                "91",
                                "Business and Finance::Industries::Advertising Industry"
                            ],
                            [
                                "58",
                                "Business and Finance::Business::Marketing and Advertising"
                            ]
                        ]
                    ]
                },
                "34d5bf8766845bf437fe3c69663692f2": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "52",
                                "Business and Finance"
                            ],
                            [
                                "53",
                                "Business and Finance::Business"
                            ]
                        ],
                        [
                            [
                                "62",
                                "Business and Finance::Business::Business Administration"
                            ],
                            [
                                "73",
                                "Business and Finance::Business::Business Operations"
                            ]
                        ],
                        [
                            [
                                "76",
                                "Business and Finance::Business::Executive Leadership & Management"
                            ]
                        ],
                        [
                            [
                                "74",
                                "Business and Finance::Business::Consumer Issues"
                            ]
                        ]
                    ]
                },
                "9bfd0bb46baa2306a253f745cec5b1f7": {
                    "IAB_v2.0_2018": [
                        [
                            [
                                "464",
                                "Science"
                            ]
                        ]
                    ]
                }
            }
        },
        "signature": {
            "resource": "POST \/classify\/:type\/:result_id",
            "status": "200 OK - successful",
            "client_ip": "209.41.117.158"
        }
    }
}