libtld 1.2.0
Functions | |
int | cmp (const char *a, const char *b, int n) |
Compare two strings, one of which is limited by length. | |
int | search (int i, int j, const char *domain, int n) |
Search for the specified domain. | |
enum tld_result | tld (const char *uri, struct tld_info *info) |
Get information about the TLD for the specified URI. | |
const char * | tld_version () |
Return the version of the library. |
int cmp (const char *a, const char *b, int n)
This internal function was created to handle a simple string (no locale) comparison with one string being limited in length.
The comparison does not require a locale since all characters are ASCII (a URI with Unicode characters encodes them in UTF-8 and replaces each of those bytes with %XX.)
The length n applies to the string in b. This allows us to use the input string as is all the way down to the cmp() function. In other words, we avoid a copy of the string.
The string in a is 'nul' (\0) terminated. This means a may be longer or shorter than b. In other words, the function is capable of returning the correct result with a single call.
[in] | a | The pointer in an f_tld field of the tld_descriptions. |
[in] | b | Pointer directly into the user domain string. |
[in] | n | The number of characters that can be checked in b . |
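The comparison described above can be sketched as follows. This is a minimal illustration, not the code found in tld.c: the name cmp_limited is hypothetical, and the return convention (negative, zero, positive) is an assumption modeled on strcmp().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the libtld source: compare the nul-terminated
 * string a against the first n bytes of b, byte by byte (no locale). */
static int cmp_limited(const char *a, const char *b, int n)
{
    assert(a != NULL && b != NULL && n >= 0);

    /* walk both strings until a ends or the n bytes of b are used up */
    for (; n > 0 && *a != '\0'; ++a, ++b, --n) {
        unsigned char ca = (unsigned char)*a;
        unsigned char cb = (unsigned char)*b;
        if (ca != cb) {
            return ca < cb ? -1 : 1;
        }
    }
    if (*a != '\0') {
        return 1;           /* b ran out first: a is longer, so a > b */
    }
    return n > 0 ? -1 : 0;  /* a ended: it is shorter than b, or equal */
}
```

With this sketch, cmp_limited("co", "co.uk", 2) returns 0 without copying "co" out of the input string, while cmp_limited("com", "co.uk", 2) is positive because a is longer than the two bytes taken from b.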
Definition at line 110 of file tld.c.
Referenced by search(), and test_compare().
int search (int i, int j, const char *domain, int n)
This function executes one search for one domain. The search is binary, which means the tld_descriptions are expected to be fully sorted at all levels.
The i and j parameters represent the boundaries of the current level to be checked. For a given TLD, a start and an end boundary are used to define i and j. So except for the top level, the bounds are limited to one TLD, sub-TLD, etc. (for example, .uk has a sub-layer with .co, .ac, etc. and that range is limited to the second-level entries accepted within the .uk TLD.)
This function searches one level only. If sub-levels are available for that TLD, then it is the responsibility of the caller to call the function again to find out whether one of those sub-domain names is in use.
When the TLD cannot be found, the function returns -1.
[in] | i | The start point of the search (inclusive). |
[in] | j | The end point of the search (exclusive). |
[in] | domain | The domain name to search. |
[in] | n | The length of the domain name. |
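A binary search over the half-open range [i, j) can be sketched as shown here. The table demo_tlds and the name search_level are hypothetical stand-ins for tld_descriptions and search(); the real table layout is not shown in this page, so this only illustrates the technique.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sorted table standing in for tld_descriptions. */
static const char *const demo_tlds[] = { "ac", "co", "gov", "ltd", "org" };
#define DEMO_COUNT ((int)(sizeof(demo_tlds) / sizeof(demo_tlds[0])))

/* Binary search in [i, j) for the first n bytes of domain.
 * Returns the index of the matching entry, or -1 when not found. */
static int search_level(int i, int j, const char *domain, int n)
{
    assert(0 <= i && i <= j && j <= DEMO_COUNT && n >= 0);

    while (i < j) {
        int p = i + (j - i) / 2;
        int r = strncmp(demo_tlds[p], domain, (size_t)n);
        if (r == 0 && strlen(demo_tlds[p]) > (size_t)n) {
            r = 1;  /* entry has extra characters, so it sorts after domain */
        }
        if (r == 0) {
            return p;
        }
        if (r < 0) {
            i = p + 1;  /* entry is smaller: continue in the upper half */
        } else {
            j = p;      /* entry is larger: continue in the lower half */
        }
    }
    return -1;
}
```

Calling search_level(0, DEMO_COUNT, "co.uk", 2) finds "co" at index 1, while search_level(2, DEMO_COUNT, "co.uk", 2) returns -1 because index 1 lies outside the bounds.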
Definition at line 169 of file tld.c.
References cmp(), tld_description::f_tld, tld(), and tld_descriptions.
Referenced by test_search(), test_search_array(), and tld().
enum tld_result tld (const char *uri, struct tld_info *info)
The tld() function searches for the specified URI in the TLD descriptions. The results are saved in the info parameter for later interpretation (i.e. extraction of the domain name, sub-domains and the exact TLD.)
The function extracts the last extension of the URI. For example, in the following:
example.co.uk
the function first extracts ".uk". With that extension, it searches the list of official TLDs. If not found, an error is returned and the info parameter is set to unknown.
When found, the function checks whether that TLD (".uk" in our previous example) accepts sub-TLDs (second, third, fourth level TLDs.) If so, it extracts the next TLD entry (the ".co" in our previous example) and searches for that second-level TLD. If found, we try again with the third level, etc. until all the possible TLDs are exhausted. At that point, we return the last TLD we found. In the case of ".co.uk", we return the information of the ".co" TLD, a second-level domain name.
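The right-to-left walk described above can be illustrated with a small sketch. Everything here is an assumption made for the example: the two-level table demo_table, the struct demo_entry, and the function demo_tld_offset are hypothetical, the lookup within a level is linear rather than binary to keep the code short, and the status and exception handling of the real tld() is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: entries 0..1 are the top level; an entry may own a
 * half-open range [start, end) of sub-TLD entries in the same array. */
struct demo_entry { const char *name; int start; int end; };

static const struct demo_entry demo_table[] = {
    { "com", 0, 0 },  /* 0: top level, no sub-TLDs      */
    { "uk",  2, 4 },  /* 1: top level, sub-TLDs in [2,4) */
    { "ac",  0, 0 },  /* 2: .ac.uk                       */
    { "co",  0, 0 },  /* 3: .co.uk                       */
};

/* Linear lookup of the n-byte label within [i, j); -1 when absent. */
static int find_in_range(int i, int j, const char *label, size_t n)
{
    for (; i < j; ++i) {
        if (strlen(demo_table[i].name) == n
                && memcmp(demo_table[i].name, label, n) == 0) {
            return i;
        }
    }
    return -1;
}

/* Returns the offset of the '.' that starts the longest known TLD,
 * or -1 when the last label is not in the table. */
static int demo_tld_offset(const char *uri)
{
    assert(uri != NULL);

    const char *end = uri + strlen(uri);  /* one past the current label */
    int lo = 0, hi = 2;                   /* top-level range */
    int offset = -1;

    while (end > uri && lo < hi) {
        const char *start = end;
        while (start > uri && start[-1] != '.') {
            --start;                      /* label is [start, end) */
        }
        if (start == uri) {
            break;                        /* no leading '.': not a TLD label */
        }
        int idx = find_in_range(lo, hi, start, (size_t)(end - start));
        if (idx < 0) {
            break;                        /* keep the last TLD found */
        }
        offset = (int)(start - 1 - uri);
        lo = demo_table[idx].start;       /* descend into the sub-TLD range */
        hi = demo_table[idx].end;
        end = start - 1;                  /* move to the label on the left */
    }
    return offset;
}
```

For "www.example.co.uk" the sketch first matches "uk", then "co" within the .uk range, and returns 11, the position of ".co.uk" in the string.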
The info structure includes the fields f_category, f_status, f_country, f_tld, and f_offset.
Assuming that you always get valid URIs, you should get one of those results:
Other results are returned when the input string is considered invalid.
// from example.c
#include "tld.h"
#include <stdio.h>

int main()
{
    char *uri = "www.example.co.uk";
    struct tld_info info;
    enum tld_result r;

    r = tld(uri, &info);
    if(r == TLD_RESULT_SUCCESS)
    {
        const char *tld = info.f_tld;
        const char *s = uri + info.f_offset - 1;
        while(s > uri)
        {
            if(*s == '.')
            {
                ++s;
                break;
            }
            --s;
        }
        // here uri points to your sub-domains, the length is "s - uri"
        // if uri == s then there are no sub-domains
        // s points to the domain name, the length is "info.f_tld - s"
        // and info.f_tld points to the TLD
        //
        // When TLD_RESULT_SUCCESS is returned the domain cannot be an
        // empty string; also the TLD cannot be empty, however, there
        // may be no sub-domains.
        printf("Sub-domain(s): \"%.*s\"\n", (int)(s - uri), uri);
        printf("Domain: \"%.*s\"\n", (int)(info.f_tld - s), s);
        printf("TLD: \"%s\"\n", info.f_tld);
    }
}
[in] | uri | The URI to be checked. |
[out] | info | A pointer to a tld_info structure to save the result. |
Definition at line 315 of file tld.c.
References tld_description::f_category, tld_info::f_category, tld_description::f_country, tld_info::f_country, tld_description::f_end_offset, tld_description::f_exception_apply_to, tld_description::f_exception_level, tld_info::f_offset, tld_description::f_start_offset, tld_description::f_status, tld_info::f_status, tld_info::f_tld, search(), TLD_CATEGORY_UNDEFINED, tld_descriptions, tld_end_offset, tld_max_level, TLD_RESULT_BAD_URI, TLD_RESULT_INVALID, TLD_RESULT_NO_TLD, TLD_RESULT_NOT_FOUND, TLD_RESULT_NULL, TLD_RESULT_SUCCESS, tld_start_offset, TLD_STATUS_EXCEPTION, TLD_STATUS_UNDEFINED, and TLD_STATUS_VALID.
Referenced by main(), snap::output_tlds(), snap::read_tlds(), search(), test_all(), test_invalid(), test_tlds(), and test_unknown().
const char* tld_version ()
This function returns the version of this library. The version is defined with three numbers: <major>.<minor>.<patch>.
You should be able to use the libversion to compare different libtld versions and know which one is the newest version.
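As an illustration of comparing two <major>.<minor>.<patch> strings by hand, here is a minimal sketch. The helper compare_versions is hypothetical and is not part of libtld; the -2 return value for unparsable input is a choice made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: compare two "<major>.<minor>.<patch>" strings.
 * Returns -1, 0 or 1; returns -2 when either string does not parse. */
static int compare_versions(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    int va[3], vb[3];
    if (sscanf(a, "%d.%d.%d", &va[0], &va[1], &va[2]) != 3
            || sscanf(b, "%d.%d.%d", &vb[0], &vb[1], &vb[2]) != 3) {
        return -2;
    }
    /* compare numerically, most significant number first */
    for (int k = 0; k < 3; ++k) {
        if (va[k] != vb[k]) {
            return va[k] < vb[k] ? -1 : 1;
        }
    }
    return 0;
}
```

A caller could pass the string returned by tld_version() as one of the arguments, for example compare_versions(tld_version(), "1.2.0"). Comparing the numbers rather than the raw strings matters because "1.10.0" is newer than "1.2.0" even though it sorts first as text.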
Definition at line 432 of file tld.c.
References LIBTLD_VERSION.
Referenced by main().
This document is part of the libtld Project.
Copyright by Made to Order Software Corp.