libtld 1.2.0
#define LIBTLD_VERSION "@LIBTLD_VERSION_MAJOR@.@LIBTLD_VERSION_MINOR@.@LIBTLD_VERSION_PATCH@"
Definition at line 28 of file tld.h.in.
Referenced by tld_version().
enum tld_category
Defines the category of the TLD. The best-known categories are international TLDs (such as .com and .info) and country TLDs (such as .us, .uk, and .fr).
IANA also offers, and is working on, other extensions such as .pro for professionals and .arpa for its internal infrastructure.
enum tld_result
This enumeration defines all the possible results of the tld() function.
Only the TLD_RESULT_SUCCESS is considered to represent a valid result.
The TLD_RESULT_INVALID represents a TLD that was found but is not currently marked as valid (it may be deprecated or proposed, for example.)
TLD_RESULT_SUCCESS | Success! The TLD of the specified URI is valid. This result is returned when the URI includes a valid TLD. The function also saves the details in the tld_info structure. You can accept this URI as valid. |
TLD_RESULT_INVALID | The TLD was found, but it is marked as invalid. This result represents a TLD that is defined in the TLD data but is not valid as is for a URI. The function saves further information in the tld_info structure, where you can check the category, status, and other parameters to determine what the TLD really represents. It may be possible to use such a TLD, although as far as web addresses are concerned, these are not considered valid. As mentioned in the statuses, some may mean that the TLD can be replaced by another one that works (e.g. a country name that changed). |
TLD_RESULT_NULL | The input URI is empty. The tld() function returns this value whenever the input URI pointer is NULL or the empty string (""). No TLD can be found in this case. |
TLD_RESULT_NO_TLD | The input URI has no TLD defined. This error is returned whenever the URI does not include at least one period (.). Local URIs are considered valid and generally don't include a period (e.g. "localhost", "my-computer", "johns-computer"). The tld() function is not expected to be called with such URIs. A valid Internet URI must include a TLD. |
TLD_RESULT_BAD_URI | The URI includes characters that are not accepted by the function. This value is returned when a character or a sequence of characters is found to be incompatible. At this time, tld() returns this error if two periods (.) are found one after another. More checks will be added over time to detect invalid characters (anything outside of [-a-zA-Z0-9.%]). Note that the URI should not start or end with a period. This error will also be returned (at some point) when the function detects such problems. |
TLD_RESULT_NOT_FOUND | The TLD of the URI could not be determined. It was searched for in the TLD data and could not be found there. This means the TLD is not a valid Internet TLD. |
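The checks described for TLD_RESULT_BAD_URI can be sketched as a small pre-check run before calling tld(). This is only an illustration of the rules listed above (accepted characters, no two periods in a row, no leading or trailing period). The helper name uri_passes_basic_checks() is hypothetical and not part of libtld, and the library's own checks may differ.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libtld.
 * Returns 1 when uri passes the checks described for TLD_RESULT_BAD_URI,
 * 0 otherwise. NULL and "" are rejected too (see TLD_RESULT_NULL). */
int uri_passes_basic_checks(const char *uri)
{
    size_t i;

    if(uri == NULL || uri[0] == '\0')
    {
        return 0;
    }

    /* The URI should not start with a period. */
    if(uri[0] == '.')
    {
        return 0;
    }

    for(i = 0; uri[i] != '\0'; ++i)
    {
        char c = uri[i];
        int accepted = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '%';

        /* Anything outside of [-a-zA-Z0-9.%] is rejected. */
        if(!accepted)
        {
            return 0;
        }

        /* Two periods one after another are rejected. */
        if(c == '.' && uri[i + 1] == '.')
        {
            return 0;
        }
    }

    /* The URI should not end with a period (i >= 1 at this point). */
    return uri[i - 1] != '.';
}
```

A caller could run this helper first and only pass the URI to tld() when it returns 1.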
enum tld_status
Each TLD has a status. By default, a TLD is generally considered valid; however, many TLDs are either proposed or deprecated.
Proposed TLDs are not yet officially accepted by the official entities taking care of those TLDs. They should be refused, but may become available later.
Deprecated TLDs were once in use but have since been dropped, for example because a country did not follow up on its Internet TLD or because the extension was found to be boycotted.
enum tld_result tld(const char *uri, struct tld_info *info)
The tld() function searches for the specified URI in the TLD descriptions. The results are saved in the info parameter for later interpretation (i.e. extraction of the domain name, sub-domains, and the exact TLD).
The function extracts the last extension of the URI. For example, in the following:
example.co.uk
the function first extracts ".uk". With that extension, it searches the list of official TLDs. If not found, an error is returned and the info parameter is set to unknown.
When found, the function checks whether that TLD (".uk" in the previous example) accepts sub-TLDs (second-, third-, or fourth-level TLDs). If so, it extracts the next TLD entry (the ".co" in the previous example) and searches for that second-level TLD. If found, it tries again with the third level, and so on, until all possible TLDs are exhausted. At that point, it returns the last TLD found. In the case of ".co.uk", it returns the information for the ".co" second-level TLD.
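The right-to-left search described above can be illustrated with a simplified sketch. It is not the library's implementation: the known_suffixes table, is_known_suffix(), and find_suffix_offset() are hypothetical stand-ins, and the sketch tries every longer suffix instead of first checking whether a TLD accepts sub-TLDs.

```c
#include <stddef.h>
#include <string.h>

/* Tiny hypothetical stand-in for the TLD data. */
static const char *const known_suffixes[] = { ".uk", ".co.uk", ".com", ".fr" };

static int is_known_suffix(const char *s)
{
    size_t i;

    for(i = 0; i < sizeof(known_suffixes) / sizeof(known_suffixes[0]); ++i)
    {
        if(strcmp(known_suffixes[i], s) == 0)
        {
            return 1;
        }
    }
    return 0;
}

/* Returns the offset of the period that starts the longest known suffix,
 * or -1 when the last extension is unknown or the URI has no period. */
long find_suffix_offset(const char *uri)
{
    long best = -1;
    const char *p = uri + strlen(uri);

    while(p > uri)
    {
        /* Step back to the previous period (or to the start of the URI). */
        --p;
        while(p > uri && *p != '.')
        {
            --p;
        }

        /* Stop when there are no more labels or this level is unknown. */
        if(*p != '.' || !is_known_suffix(p))
        {
            break;
        }

        /* Remember the deepest level found so far. */
        best = (long) (p - uri);
    }

    return best;
}
```

With this table, "www.example.co.uk" first matches ".uk", then ".co.uk", and stops at ".example.co.uk", so the offset returned is that of ".co.uk".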
The info structure includes:
Assuming that you always get valid URIs, you should get one of those results:
Other results are returned when the input string is considered invalid.
// from example.c
#include "tld.h"
#include <stdio.h>

int main()
{
    char *uri = "www.example.co.uk";
    struct tld_info info;
    enum tld_result r;

    r = tld(uri, &info);
    if(r == TLD_RESULT_SUCCESS)
    {
        const char *tld = info.f_tld;
        const char *s = uri + info.f_offset - 1;
        while(s > uri)
        {
            if(*s == '.')
            {
                ++s;
                break;
            }
            --s;
        }
        // here uri points to your sub-domains, the length is "s - uri"
        // if uri == s then there are no sub-domains
        // s points to the domain name, the length is "info.f_tld - s"
        // and info.f_tld points to the TLD
        //
        // When TLD_RESULT_SUCCESS is returned the domain cannot be an
        // empty string; also the TLD cannot be empty, however, there
        // may be no sub-domains.
        printf("Sub-domain(s): \"%.*s\"\n", (int)(s - uri), uri);
        printf("Domain: \"%.*s\"\n", (int)(info.f_tld - s), s);
        printf("TLD: \"%s\"\n", info.f_tld);
    }
}
[in] | uri | The URI to be checked. |
[out] | info | A pointer to a tld_info structure to save the result. |
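Since only TLD_RESULT_SUCCESS represents a valid result, callers usually branch on the value returned by tld(). The sketch below shows one way to do so. The helpers describe_tld_result() and uri_is_acceptable() are hypothetical, and the enum is redeclared only so the sketch compiles on its own; with libtld installed, drop that declaration and include "tld.h" instead.

```c
#include <stddef.h>

/* Stand-in so the sketch compiles on its own; the numeric values are
 * arbitrary. With libtld installed, remove this and include "tld.h". */
enum tld_result
{
    TLD_RESULT_SUCCESS,
    TLD_RESULT_INVALID,
    TLD_RESULT_NULL,
    TLD_RESULT_NO_TLD,
    TLD_RESULT_BAD_URI,
    TLD_RESULT_NOT_FOUND
};

/* Hypothetical helper: a short message for each documented result. */
const char *describe_tld_result(enum tld_result r)
{
    switch(r)
    {
    case TLD_RESULT_SUCCESS:   return "valid TLD";
    case TLD_RESULT_INVALID:   return "TLD found but not marked as valid";
    case TLD_RESULT_NULL:      return "URI is NULL or empty";
    case TLD_RESULT_NO_TLD:    return "URI has no period, so no TLD";
    case TLD_RESULT_BAD_URI:   return "URI has unacceptable characters";
    case TLD_RESULT_NOT_FOUND: return "TLD not found in the TLD data";
    }
    return "unknown result";
}

/* Hypothetical helper: only TLD_RESULT_SUCCESS means the URI is accepted. */
int uri_is_acceptable(enum tld_result r)
{
    return r == TLD_RESULT_SUCCESS;
}
```

When TLD_RESULT_INVALID is returned, the tld_info structure can still be inspected (category, status) before deciding how to treat the URI.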
Definition at line 315 of file tld.c.
References tld_description::f_category, tld_info::f_category, tld_description::f_country, tld_info::f_country, tld_description::f_end_offset, tld_description::f_exception_apply_to, tld_description::f_exception_level, tld_info::f_offset, tld_description::f_start_offset, tld_description::f_status, tld_info::f_status, tld_info::f_tld, search(), TLD_CATEGORY_UNDEFINED, tld_descriptions, tld_end_offset, tld_max_level, TLD_RESULT_BAD_URI, TLD_RESULT_INVALID, TLD_RESULT_NO_TLD, TLD_RESULT_NOT_FOUND, TLD_RESULT_NULL, TLD_RESULT_SUCCESS, tld_start_offset, TLD_STATUS_EXCEPTION, TLD_STATUS_UNDEFINED, and TLD_STATUS_VALID.
Referenced by main(), snap::output_tlds(), snap::read_tlds(), search(), test_all(), test_invalid(), test_tlds(), and test_unknown().
const char *tld_version()
This function returns the version of this library. The version is defined with three numbers: <major>.<minor>.<patch>.
You can use the version string to compare different libtld versions and determine which one is the newest.
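Comparing two version strings as text gives the wrong order once a number reaches two digits ("1.10.0" sorts before "1.2.0"), so the three numbers are best compared one by one. The following is a minimal sketch of such a comparison. The helper compare_versions() is hypothetical and not part of libtld.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libtld.
 * Parses two "<major>.<minor>.<patch>" strings and stores -1, 0, or 1 in
 * *result when a is older than, equal to, or newer than b.
 * Returns 1 on success, 0 when either string cannot be parsed. */
int compare_versions(const char *a, const char *b, int *result)
{
    int va[3];
    int vb[3];
    int i;

    if(sscanf(a, "%d.%d.%d", &va[0], &va[1], &va[2]) != 3
    || sscanf(b, "%d.%d.%d", &vb[0], &vb[1], &vb[2]) != 3)
    {
        return 0;
    }

    /* Compare major, then minor, then patch; stop at the first difference. */
    *result = 0;
    for(i = 0; i < 3 && *result == 0; ++i)
    {
        *result = (va[i] > vb[i]) - (va[i] < vb[i]);
    }

    return 1;
}
```

A program linked against libtld could pass the string returned by tld_version() as one argument and the minimum version it requires as the other.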
Definition at line 432 of file tld.c.
References LIBTLD_VERSION.
Referenced by main().
This document is part of the libtld Project.
Copyright by Made to Order Software Corp.