Merge branch 'ebassi/girepository' into 'main'

Move libgirepository into GLib

Closes #455, #457, #49, #13, #318, #298, #38, #200, #96, #244, #175, and #218

See merge request GNOME/glib!3642
Philip Withnall 2023-11-08 12:24:03 +00:00
commit 2787a86693
141 changed files with 39445 additions and 0 deletions

@@ -41,3 +41,10 @@ License: CC0-1.0
Files: docs/reference/glib/gvariant-*.svg
Copyright: 2022 Philip Withnall
License: CC-BY-SA-3.0
# libgirepository uses cmph as a copylib. Adding copyright/license data to the
# files there would cause divergence from upstream. See
# girepository/cmph/README-CMPH-IMPORT.txt.
Files: girepository/cmph/*
Copyright: CMPH contributors
License: LGPL-2.1-only or MPL-1.1
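
Reading the stanza above: every path under girepository/cmph/ is attributed to the CMPH contributors and may be used under either LGPL-2.1-only or MPL-1.1. As a rough illustration only (not how the REUSE tooling is implemented), the lookup can be sketched in Python. The helper names `parse_stanza` and `license_for` and the file name `example.c` are hypothetical, and the glob matching is an assumption that `*` may also match `/`.

```python
from fnmatch import fnmatchcase

# The stanza added by this hunk, copied verbatim (comment lines omitted).
STANZA = """\
Files: girepository/cmph/*
Copyright: CMPH contributors
License: LGPL-2.1-only or MPL-1.1
"""


def parse_stanza(text):
    """Turn 'Key: value' lines into a dict, skipping blanks and '#' comments."""
    fields = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def license_for(path, stanza):
    """Return the License field if path matches a Files pattern, else None."""
    # Assumption: patterns are whitespace-separated globs where '*' also
    # matches '/', which is how fnmatchcase behaves.
    patterns = stanza["Files"].split()
    if any(fnmatchcase(path, pattern) for pattern in patterns):
        return stanza["License"]
    return None


stanza = parse_stanza(STANZA)
# 'example.c' is a made-up file name used only for illustration.
print(license_for("girepository/cmph/example.c", stanza))
print(license_for("glib/gmain.c", stanza))
```

The second lookup falls outside the stanza's pattern, so this sketch returns None for it; in the real file, other stanzas or in-file headers would supply that information.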

LICENSES/LGPL-2.1-only.txt (new file, 175 lines)

@@ -0,0 +1,175 @@
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
[This is the first released version of the Lesser GPL. It also counts as the successor of the GNU Library Public License, version 2, hence the version number 2.1.]
Preamble
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some specially designated software packages--typically libraries--of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below.
When we speak of free software, we are referring to freedom of use, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things.
To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights.
We protect your rights with a two-step method: (1) we copyright the library, and (2) we offer you this license, which gives you legal permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others.
Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license.
Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs.
When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. The Lesser General Public License permits more lax criteria for linking other code with the library.
We call this license the "Lesser" General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free programs. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances.
For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License.
In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system.
Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library.
The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, whereas the latter must be combined with the library in order to run.
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License Agreement applies to any software library or other program which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Lesser General Public License (also called "this License"). Each licensee is addressed as "you".
A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:
a) The modified work must itself be a software library.
b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change.
c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License.
d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful.
(For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.)
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library.
In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices.
Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of the Library into a program that is not a library.
4. You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange.
If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code.
5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License.
However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself.
6. As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications.
You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things:
a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.)
b) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (1) uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and (2) will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with.
c) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution.
d) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place.
e) Verify that the user has already received a copy of these materials or that you have already sent this user a copy.
For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute.
7. You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things:
a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above.
b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work.
8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it.
10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License.
11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
12. If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.
13. The Free Software Foundation may publish revised and/or new versions of the Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation.
14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.
NO WARRANTY
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Libraries
If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License).
To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.
one line to give the library's name and an idea of what it does.
Copyright (C) year name of author
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Also add information on how to contact you by electronic and paper mail.
You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in
the library `Frob' (a library for tweaking knobs) written
by James Random Hacker.
signature of Ty Coon, 1 April 1990
Ty Coon, President of Vice
That's all there is to it!

LICENSES/MPL-1.1.txt (new file, 143 lines)

@@ -0,0 +1,143 @@
Mozilla Public License Version 1.1
1. Definitions.
1.0.1. "Commercial Use" means distribution or otherwise making the Covered Code available to a third party.
1.1. "Contributor" means each entity that creates or contributes to the creation of Modifications.
1.2. "Contributor Version" means the combination of the Original Code, prior Modifications used by a Contributor, and the Modifications made by that particular Contributor.
1.3. "Covered Code" means the Original Code or Modifications or the combination of the Original Code and Modifications, in each case including portions thereof.
1.4. "Electronic Distribution Mechanism" means a mechanism generally accepted in the software development community for the electronic transfer of data.
1.5. "Executable" means Covered Code in any form other than Source Code.
1.6. "Initial Developer" means the individual or entity identified as the Initial Developer in the Source Code notice required by Exhibit A.
1.7. "Larger Work" means a work which combines Covered Code or portions thereof with code not governed by the terms of this License.
1.8. "License" means this document.
1.8.1. "Licensable" means having the right to grant, to the maximum extent possible, whether at the time of the initial grant or subsequently acquired, any and all of the rights conveyed herein.
1.9. "Modifications" means any addition to or deletion from the substance or structure of either the Original Code or any previous Modifications. When Covered Code is released as a series of files, a Modification is:
Any addition to or deletion from the contents of a file containing Original Code or previous Modifications.
Any new file that contains any part of the Original Code or previous Modifications.
1.10. "Original Code" means Source Code of computer software code which is described in the Source Code notice required by Exhibit A as Original Code, and which, at the time of its release under this License is not already Covered Code governed by this License.
1.10.1. "Patent Claims" means any patent claim(s), now owned or hereafter acquired, including without limitation, method, process, and apparatus claims, in any patent Licensable by grantor.
1.11. "Source Code" means the preferred form of the Covered Code for making modifications to it, including all modules it contains, plus any associated interface definition files, scripts used to control compilation and installation of an Executable, or source code differential comparisons against either the Original Code or another well known, available Covered Code of the Contributor's choice. The Source Code can be in a compressed or archival form, provided the appropriate decompression or de-archiving software is widely available for no charge.
1.12. "You" (or "Your") means an individual or a legal entity exercising rights under, and complying with all of the terms of, this License or a future version of this License issued under Section 6.1. For legal entities, "You" includes any entity which controls, is controlled by, or is under common control with You. For purposes of this definition, "control" means (a) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (b) ownership of more than fifty percent (50%) of the outstanding shares or beneficial ownership of such entity.
2. Source Code License.
2.1. The Initial Developer Grant. The Initial Developer hereby grants You a world-wide, royalty-free, non-exclusive license, subject to third party intellectual property claims:
a. under intellectual property rights (other than patent or trademark) Licensable by Initial Developer to use, reproduce, modify, display, perform, sublicense and distribute the Original Code (or portions thereof) with or without Modifications, and/or as part of a Larger Work; and
b. under Patents Claims infringed by the making, using or selling of Original Code, to make, have made, use, practice, sell, and offer for sale, and/or otherwise dispose of the Original Code (or portions thereof).
c. the licenses granted in this Section 2.1 (a) and (b) are effective on the date Initial Developer first distributes Original Code under the terms of this License.
d. Notwithstanding Section 2.1 (b) above, no patent license is granted: 1) for code that You delete from the Original Code; 2) separate from the Original Code; or 3) for infringements caused by: i) the modification of the Original Code or ii) the combination of the Original Code with other software or devices.
2.2. Contributor Grant. Subject to third party intellectual property claims, each Contributor hereby grants You a world-wide, royalty-free, non-exclusive license
a. under intellectual property rights (other than patent or trademark) Licensable by Contributor, to use, reproduce, modify, display, perform, sublicense and distribute the Modifications created by such Contributor (or portions thereof) either on an unmodified basis, with other Modifications, as Covered Code and/or as part of a Larger Work; and
b. under Patent Claims infringed by the making, using, or selling of Modifications made by that Contributor either alone and/or in combination with its Contributor Version (or portions of such combination), to make, use, sell, offer for sale, have made, and/or otherwise dispose of: 1) Modifications made by that Contributor (or portions thereof); and 2) the combination of Modifications made by that Contributor with its Contributor Version (or portions of such combination).
c. the licenses granted in Sections 2.2 (a) and 2.2 (b) are effective on the date Contributor first makes Commercial Use of the Covered Code.
d. Notwithstanding Section 2.2 (b) above, no patent license is granted: 1) for any code that Contributor has deleted from the Contributor Version; 2) separate from the Contributor Version; 3) for infringements caused by: i) third party modifications of Contributor Version or ii) the combination of Modifications made by that Contributor with other software (except as part of the Contributor Version) or other devices; or 4) under Patent Claims infringed by Covered Code in the absence of Modifications made by that Contributor.
3. Distribution Obligations.
3.1. Application of License. The Modifications which You create or to which You contribute are governed by the terms of this License, including without limitation Section 2.2. The Source Code version of Covered Code may be distributed only under the terms of this License or a future version of this License released under Section 6.1, and You must include a copy of this License with every copy of the Source Code You distribute. You may not offer or impose any terms on any Source Code version that alters or restricts the applicable version of this License or the recipients' rights hereunder. However, You may include an additional document offering the additional rights described in Section 3.5.
3.2. Availability of Source Code. Any Modification which You create or to which You contribute must be made available in Source Code form under the terms of this License either on the same media as an Executable version or via an accepted Electronic Distribution Mechanism to anyone to whom you made an Executable version available; and if made available via Electronic Distribution Mechanism, must remain available for at least twelve (12) months after the date it initially became available, or at least six (6) months after a subsequent version of that particular Modification has been made available to such recipients. You are responsible for ensuring that the Source Code version remains available even if the Electronic Distribution Mechanism is maintained by a third party.
3.3. Description of Modifications. You must cause all Covered Code to which You contribute to contain a file documenting the changes You made to create that Covered Code and the date of any change. You must include a prominent statement that the Modification is derived, directly or indirectly, from Original Code provided by the Initial Developer and including the name of the Initial Developer in (a) the Source Code, and (b) in any notice in an Executable version or related documentation in which You describe the origin or ownership of the Covered Code.
3.4. Intellectual Property Matters
(a) Third Party Claims
If Contributor has knowledge that a license under a third party's intellectual property rights is required to exercise the rights granted by such Contributor under Sections 2.1 or 2.2, Contributor must include a text file with the Source Code distribution titled "LEGAL" which describes the claim and the party making the claim in sufficient detail that a recipient will know whom to contact. If Contributor obtains such knowledge after the Modification is made available as described in Section 3.2, Contributor shall promptly modify the LEGAL file in all copies Contributor makes available thereafter and shall take other steps (such as notifying appropriate mailing lists or newsgroups) reasonably calculated to inform those who received the Covered Code that new knowledge has been obtained.
(b) Contributor APIs
If Contributor's Modifications include an application programming interface and Contributor has knowledge of patent licenses which are reasonably necessary to implement that API, Contributor must also include this information in the LEGAL file.
(c) Representations.
Contributor represents that, except as disclosed pursuant to Section 3.4 (a) above, Contributor believes that Contributor's Modifications are Contributor's original creation(s) and/or Contributor has sufficient rights to grant the rights conveyed by this License.
3.5. Required Notices. You must duplicate the notice in Exhibit A in each file of the Source Code. If it is not possible to put such notice in a particular Source Code file due to its structure, then You must include such notice in a location (such as a relevant directory) where a user would be likely to look for such a notice. If You created one or more Modification(s) You may add your name as a Contributor to the notice described in Exhibit A. You must also duplicate this License in any documentation for the Source Code where You describe recipients' rights or ownership rights relating to Covered Code. You may choose to offer, and to charge a fee for, warranty, support, indemnity or liability obligations to one or more recipients of Covered Code. However, You may do so only on Your own behalf, and not on behalf of the Initial Developer or any Contributor. You must make it absolutely clear than any such warranty, support, indemnity or liability obligation is offered by You alone, and You hereby agree to indemnify the Initial Developer and every Contributor for any liability incurred by the Initial Developer or such Contributor as a result of warranty, support, indemnity or liability terms You offer.
3.6. Distribution of Executable Versions. You may distribute Covered Code in Executable form only if the requirements of Sections 3.1, 3.2, 3.3, 3.4 and 3.5 have been met for that Covered Code, and if You include a notice stating that the Source Code version of the Covered Code is available under the terms of this License, including a description of how and where You have fulfilled the obligations of Section 3.2. The notice must be conspicuously included in any notice in an Executable version, related documentation or collateral in which You describe recipients' rights relating to the Covered Code. You may distribute the Executable version of Covered Code or ownership rights under a license of Your choice, which may contain terms different from this License, provided that You are in compliance with the terms of this License and that the license for the Executable version does not attempt to limit or alter the recipient's rights in the Source Code version from the rights set forth in this License. If You distribute the Executable version under a different license You must make it absolutely clear that any terms which differ from this License are offered by You alone, not by the Initial Developer or any Contributor. You hereby agree to indemnify the Initial Developer and every Contributor for any liability incurred by the Initial Developer or such Contributor as a result of any such terms You offer.
3.7. Larger Works. You may create a Larger Work by combining Covered Code with other code not governed by the terms of this License and distribute the Larger Work as a single product. In such a case, You must make sure the requirements of this License are fulfilled for the Covered Code.
4. Inability to Comply Due to Statute or Regulation.
If it is impossible for You to comply with any of the terms of this License with respect to some or all of the Covered Code due to statute, judicial order, or regulation then You must: (a) comply with the terms of this License to the maximum extent possible; and (b) describe the limitations and the code they affect. Such description must be included in the LEGAL file described in Section 3.4 and must be included with all distributions of the Source Code. Except to the extent prohibited by statute or regulation, such description must be sufficiently detailed for a recipient of ordinary skill to be able to understand it.
5. Application of this License.
This License applies to code to which the Initial Developer has attached the notice in Exhibit A and to related Covered Code.
6. Versions of the License.
6.1. New Versions
Netscape Communications Corporation ("Netscape") may publish revised and/or new versions of the License from time to time. Each version will be given a distinguishing version number.
6.2. Effect of New Versions
Once Covered Code has been published under a particular version of the License, You may always continue to use it under the terms of that version. You may also choose to use such Covered Code under the terms of any subsequent version of the License published by Netscape. No one other than Netscape has the right to modify the terms applicable to Covered Code created under this License.
6.3. Derivative Works
If You create or use a modified version of this License (which you may only do in order to apply it to code which is not already Covered Code governed by this License), You must (a) rename Your license so that the phrases "Mozilla", "MOZILLAPL", "MOZPL", "Netscape", "MPL", "NPL" or any confusingly similar phrase do not appear in your license (except to note that your license differs from this License) and (b) otherwise make it clear that Your version of the license contains terms which differ from the Mozilla Public License and Netscape Public License. (Filling in the name of the Initial Developer, Original Code or Contributor in the notice described in Exhibit A shall not of themselves be deemed to be modifications of this License.)
7. DISCLAIMER OF WARRANTY
COVERED CODE IS PROVIDED UNDER THIS LICENSE ON AN "AS IS" BASIS, WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, WITHOUT LIMITATION, WARRANTIES THAT THE COVERED CODE IS FREE OF DEFECTS, MERCHANTABLE, FIT FOR A PARTICULAR PURPOSE OR NON-INFRINGING. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE COVERED CODE IS WITH YOU. SHOULD ANY COVERED CODE PROVE DEFECTIVE IN ANY RESPECT, YOU (NOT THE INITIAL DEVELOPER OR ANY OTHER CONTRIBUTOR) ASSUME THE COST OF ANY NECESSARY SERVICING, REPAIR OR CORRECTION. THIS DISCLAIMER OF WARRANTY CONSTITUTES AN ESSENTIAL PART OF THIS LICENSE. NO USE OF ANY COVERED CODE IS AUTHORIZED HEREUNDER EXCEPT UNDER THIS DISCLAIMER.
8. Termination
8.1. This License and the rights granted hereunder will terminate automatically if You fail to comply with terms herein and fail to cure such breach within 30 days of becoming aware of the breach. All sublicenses to the Covered Code which are properly granted shall survive any termination of this License. Provisions which, by their nature, must remain in effect beyond the termination of this License shall survive.
8.2. If You initiate litigation by asserting a patent infringement claim (excluding declatory judgment actions) against Initial Developer or a Contributor (the Initial Developer or Contributor against whom You file such action is referred to as "Participant") alleging that:
a. such Participant's Contributor Version directly or indirectly infringes any patent, then any and all rights granted by such Participant to You under Sections 2.1 and/or 2.2 of this License shall, upon 60 days notice from Participant terminate prospectively, unless if within 60 days after receipt of notice You either: (i) agree in writing to pay Participant a mutually agreeable reasonable royalty for Your past and future use of Modifications made by such Participant, or (ii) withdraw Your litigation claim with respect to the Contributor Version against such Participant. If within 60 days of notice, a reasonable royalty and payment arrangement are not mutually agreed upon in writing by the parties or the litigation claim is not withdrawn, the rights granted by Participant to You under Sections 2.1 and/or 2.2 automatically terminate at the expiration of the 60 day notice period specified above.
b. any software, hardware, or device, other than such Participant's Contributor Version, directly or indirectly infringes any patent, then any rights granted to You by such Participant under Sections 2.1(b) and 2.2(b) are revoked effective as of the date You first made, used, sold, distributed, or had made, Modifications made by that Participant.
8.3. If You assert a patent infringement claim against Participant alleging that such Participant's Contributor Version directly or indirectly infringes any patent where such claim is resolved (such as by license or settlement) prior to the initiation of patent infringement litigation, then the reasonable value of the licenses granted by such Participant under Sections 2.1 or 2.2 shall be taken into account in determining the amount or value of any payment or license.
8.4. In the event of termination under Sections 8.1 or 8.2 above, all end user license agreements (excluding distributors and resellers) which have been validly granted by You or any distributor hereunder prior to termination shall survive termination.
9. LIMITATION OF LIABILITY
UNDER NO CIRCUMSTANCES AND UNDER NO LEGAL THEORY, WHETHER TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE, SHALL YOU, THE INITIAL DEVELOPER, ANY OTHER CONTRIBUTOR, OR ANY DISTRIBUTOR OF COVERED CODE, OR ANY SUPPLIER OF ANY OF SUCH PARTIES, BE LIABLE TO ANY PERSON FOR ANY INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF GOODWILL, WORK STOPPAGE, COMPUTER FAILURE OR MALFUNCTION, OR ANY AND ALL OTHER COMMERCIAL DAMAGES OR LOSSES, EVEN IF SUCH PARTY SHALL HAVE BEEN INFORMED OF THE POSSIBILITY OF SUCH DAMAGES. THIS LIMITATION OF LIABILITY SHALL NOT APPLY TO LIABILITY FOR DEATH OR PERSONAL INJURY RESULTING FROM SUCH PARTY'S NEGLIGENCE TO THE EXTENT APPLICABLE LAW PROHIBITS SUCH LIMITATION. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR LIMITATION OF INCIDENTAL OR CONSEQUENTIAL DAMAGES, SO THIS EXCLUSION AND LIMITATION MAY NOT APPLY TO YOU.
10. U.S. government end users
The Covered Code is a "commercial item," as that term is defined in 48 C.F.R. 2.101 (Oct. 1995), consisting of "commercial computer software" and "commercial computer software documentation," as such terms are used in 48 C.F.R. 12.212 (Sept. 1995). Consistent with 48 C.F.R. 12.212 and 48 C.F.R. 227.7202-1 through 227.7202-4 (June 1995), all U.S. Government End Users acquire Covered Code with only those rights set forth herein.
11. Miscellaneous
This License represents the complete agreement concerning subject matter hereof. If any provision of this License is held to be unenforceable, such provision shall be reformed only to the extent necessary to make it enforceable. This License shall be governed by California law provisions (except to the extent applicable law, if any, provides otherwise), excluding its conflict-of-law provisions. With respect to disputes in which at least one party is a citizen of, or an entity chartered or registered to do business in the United States of America, any litigation relating to this License shall be subject to the jurisdiction of the Federal Courts of the Northern District of California, with venue lying in Santa Clara County, California, with the losing party responsible for costs, including without limitation, court costs and reasonable attorneys' fees and expenses. The application of the United Nations Convention on Contracts for the International Sale of Goods is expressly excluded. Any law or regulation which provides that the language of a contract shall be construed against the drafter shall not apply to this License.
12. Responsibility for claims
As between Initial Developer and the Contributors, each party is responsible for claims and damages arising, directly or indirectly, out of its utilization of rights under this License and You agree to work with Initial Developer and Contributors to distribute such responsibility on an equitable basis. Nothing herein is intended or shall be deemed to constitute any admission of liability.
13. Multiple-licensed code
Initial Developer may designate portions of the Covered Code as "Multiple-Licensed". "Multiple-Licensed" means that the Initial Developer permits you to utilize portions of the Covered Code under Your choice of the MPL or the alternative licenses, if any, specified by the Initial Developer in the file described in Exhibit A.
Exhibit A - Mozilla Public License.
"The contents of this file are subject to the Mozilla Public License Version 1.1 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.mozilla.org/MPL/
Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.
The Original Code is ______________________________________.
The Initial Developer of the Original Code is ________________________.
Portions created by ______________________ are Copyright (C) ______
_______________________. All Rights Reserved.
Contributor(s): ______________________________________.
Alternatively, the contents of this file may be used under the terms of the _____ license (the "[___] License"), in which case the provisions of [______] License are applicable instead of those above. If you wish to allow use of your version of this file only under the terms of the [____] License and not to allow others to use your version of this file under the MPL, indicate your decision by deleting the provisions above and replace them with the notice and other provisions required by the [___] License. If you do not delete the provisions above, a recipient may use your version of this file under either the MPL or the [___] License."
NOTE: The text of this Exhibit A may differ slightly from the text of the notices in the Source Code files of the Original Code. You should use the text of this Exhibit A rather than the text found in the Original Code Source Code for Your Modifications.


@ -0,0 +1,142 @@
/* GObject introspection: Test cmph hashing
*
* Copyright (C) 2010 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <glib-object.h>
#include "cmph.h"
static cmph_t *
build (void)
{
cmph_config_t *config;
cmph_io_adapter_t *io;
char **strings;
cmph_t *c;
guint32 size;
strings = g_strsplit ("foo,bar,baz", ",", -1);
io = cmph_io_vector_adapter (strings, g_strv_length (strings));
config = cmph_config_new (io);
cmph_config_set_algo (config, CMPH_BDZ);
c = cmph_new (config);
size = cmph_size (c);
g_assert (size == g_strv_length (strings));
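/* The hash is fully built at this point; release the inputs so the test does not leak them. */
cmph_config_destroy (config);
cmph_io_vector_adapter_destroy (io);
g_strfreev (strings);
return c;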
}
static void
assert_hashes_unique (guint n_hashes,
guint32* hashes)
{
guint i;
for (i = 0; i < n_hashes; i++)
{
guint j = 0;
for (j = 0; j < n_hashes; j++)
{
if (j != i)
g_assert (hashes[i] != hashes[j]);
}
}
}
static void
test_search (void)
{
cmph_t *c = build();
guint i;
guint32 hash;
guint32 hashes[3];
guint32 size;
size = cmph_size (c);
i = 0;
hash = cmph_search (c, "foo", 3);
g_assert (hash >= 0 && hash < size);
hashes[i++] = hash;
hash = cmph_search (c, "bar", 3);
g_assert (hash >= 0 && hash < size);
hashes[i++] = hash;
hash = cmph_search (c, "baz", 3);
g_assert (hash >= 0 && hash < size);
hashes[i++] = hash;
assert_hashes_unique (G_N_ELEMENTS (hashes), &hashes[0]);
cmph_destroy (c);
}
static void
test_search_packed (void)
{
cmph_t *c = build();
guint32 bufsize;
guint i;
guint32 hash;
guint32 hashes[3];
guint32 size;
guint8 *buf;
bufsize = cmph_packed_size (c);
buf = g_malloc (bufsize);
cmph_pack (c, buf);
size = cmph_size (c);
cmph_destroy (c);
c = NULL;
i = 0;
hash = cmph_search_packed (buf, "foo", 3);
g_assert (hash >= 0 && hash < size);
hashes[i++] = hash;
hash = cmph_search_packed (buf, "bar", 3);
g_assert (hash >= 0 && hash < size);
hashes[i++] = hash;
hash = cmph_search_packed (buf, "baz", 3);
g_assert (hash >= 0 && hash < size);
hashes[i++] = hash;
assert_hashes_unique (G_N_ELEMENTS (hashes), &hashes[0]);
g_free (buf);
}
int
main(int argc, char **argv)
{
gint ret;
g_test_init (&argc, &argv, NULL);
g_test_add_func ("/cmph-bdz/search", test_search);
g_test_add_func ("/cmph-bdz/search-packed", test_search_packed);
ret = g_test_run ();
return ret;
}


@ -0,0 +1,5 @@
This import of CMPH was made from revision bfdcc3a3a18dfb9 of
git://cmph.git.sourceforge.net/gitroot/cmph/cmph
Only the following files were taken, and everything else deleted:
COPYING src/*.[ch]

girepository/cmph/bdz.c Normal file

@ -0,0 +1,721 @@
#include "bdz.h"
#include "cmph_structs.h"
#include "bdz_structs.h"
#include "hash.h"
#include "bitbool.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
//#define DEBUG
#include "debug.h"
#define UNASSIGNED 3U
#define NULL_EDGE 0xffffffff
//cmph_uint32 ngrafos = 0;
//cmph_uint32 ngrafos_aciclicos = 0;
// table used for looking up the number of assigned vertices in an 8-bit integer
const cmph_uint8 bdz_lookup_table[] =
{
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 1,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 1,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 3, 2,
3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 1,
3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 1,
3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 1,
3, 3, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 1,
2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 1, 1, 1, 1, 0
};
typedef struct
{
cmph_uint32 vertices[3];
cmph_uint32 next_edges[3];
}bdz_edge_t;
typedef cmph_uint32 * bdz_queue_t;
static void bdz_alloc_queue(bdz_queue_t * queuep, cmph_uint32 nedges)
{
(*queuep)=malloc(nedges*sizeof(cmph_uint32));
};
static void bdz_free_queue(bdz_queue_t * queue)
{
free(*queue);
};
typedef struct
{
cmph_uint32 nedges;
bdz_edge_t * edges;
cmph_uint32 * first_edge;
cmph_uint8 * vert_degree;
}bdz_graph3_t;
static void bdz_alloc_graph3(bdz_graph3_t * graph3, cmph_uint32 nedges, cmph_uint32 nvertices)
{
graph3->edges=malloc(nedges*sizeof(bdz_edge_t));
graph3->first_edge=malloc(nvertices*sizeof(cmph_uint32));
graph3->vert_degree=malloc((size_t)nvertices);
};
static void bdz_init_graph3(bdz_graph3_t * graph3, cmph_uint32 nedges, cmph_uint32 nvertices)
{
memset(graph3->first_edge,0xff,nvertices*sizeof(cmph_uint32));
memset(graph3->vert_degree,0,(size_t)nvertices);
graph3->nedges=0;
};
static void bdz_free_graph3(bdz_graph3_t *graph3)
{
free(graph3->edges);
free(graph3->first_edge);
free(graph3->vert_degree);
};
static void bdz_partial_free_graph3(bdz_graph3_t *graph3)
{
free(graph3->first_edge);
free(graph3->vert_degree);
graph3->first_edge = NULL;
graph3->vert_degree = NULL;
};
static void bdz_add_edge(bdz_graph3_t * graph3, cmph_uint32 v0, cmph_uint32 v1, cmph_uint32 v2)
{
graph3->edges[graph3->nedges].vertices[0]=v0;
graph3->edges[graph3->nedges].vertices[1]=v1;
graph3->edges[graph3->nedges].vertices[2]=v2;
graph3->edges[graph3->nedges].next_edges[0]=graph3->first_edge[v0];
graph3->edges[graph3->nedges].next_edges[1]=graph3->first_edge[v1];
graph3->edges[graph3->nedges].next_edges[2]=graph3->first_edge[v2];
graph3->first_edge[v0]=graph3->first_edge[v1]=graph3->first_edge[v2]=graph3->nedges;
graph3->vert_degree[v0]++;
graph3->vert_degree[v1]++;
graph3->vert_degree[v2]++;
graph3->nedges++;
};
static void bdz_dump_graph(bdz_graph3_t* graph3, cmph_uint32 nedges, cmph_uint32 nvertices)
{
cmph_uint32 i;
for(i=0;i<nedges;i++){
printf("\nedge %d %d %d %d ",i,graph3->edges[i].vertices[0],
graph3->edges[i].vertices[1],graph3->edges[i].vertices[2]);
printf(" nexts %d %d %d",graph3->edges[i].next_edges[0],
graph3->edges[i].next_edges[1],graph3->edges[i].next_edges[2]);
};
for(i=0;i<nvertices;i++){
printf("\nfirst for vertice %d %d ",i,graph3->first_edge[i]);
};
};
static void bdz_remove_edge(bdz_graph3_t * graph3, cmph_uint32 curr_edge)
{
cmph_uint32 i,j=0,vert,edge1,edge2;
for(i=0;i<3;i++){
vert=graph3->edges[curr_edge].vertices[i];
edge1=graph3->first_edge[vert];
edge2=NULL_EDGE;
while(edge1!=curr_edge&&edge1!=NULL_EDGE){
edge2=edge1;
if(graph3->edges[edge1].vertices[0]==vert){
j=0;
} else if(graph3->edges[edge1].vertices[1]==vert){
j=1;
} else
j=2;
edge1=graph3->edges[edge1].next_edges[j];
};
if(edge1==NULL_EDGE){
printf("\nerror remove edge %d dump graph",curr_edge);
bdz_dump_graph(graph3,graph3->nedges,graph3->nedges+graph3->nedges/4);
exit(-1);
};
if(edge2!=NULL_EDGE){
graph3->edges[edge2].next_edges[j] =
graph3->edges[edge1].next_edges[i];
} else
graph3->first_edge[vert]=
graph3->edges[edge1].next_edges[i];
graph3->vert_degree[vert]--;
};
};
static int bdz_generate_queue(cmph_uint32 nedges, cmph_uint32 nvertices, bdz_queue_t queue, bdz_graph3_t* graph3)
{
cmph_uint32 i,v0,v1,v2;
cmph_uint32 queue_head=0,queue_tail=0;
cmph_uint32 curr_edge;
cmph_uint32 tmp_edge;
cmph_uint8 * marked_edge =malloc((size_t)(nedges >> 3) + 1);
memset(marked_edge, 0, (size_t)(nedges >> 3) + 1);
for(i=0;i<nedges;i++){
v0=graph3->edges[i].vertices[0];
v1=graph3->edges[i].vertices[1];
v2=graph3->edges[i].vertices[2];
if(graph3->vert_degree[v0]==1 ||
graph3->vert_degree[v1]==1 ||
graph3->vert_degree[v2]==1){
if(!GETBIT(marked_edge,i)) {
queue[queue_head++]=i;
SETBIT(marked_edge,i);
}
};
};
while(queue_tail!=queue_head){
curr_edge=queue[queue_tail++];
bdz_remove_edge(graph3,curr_edge);
v0=graph3->edges[curr_edge].vertices[0];
v1=graph3->edges[curr_edge].vertices[1];
v2=graph3->edges[curr_edge].vertices[2];
if(graph3->vert_degree[v0]==1 ) {
tmp_edge=graph3->first_edge[v0];
if(!GETBIT(marked_edge,tmp_edge)) {
queue[queue_head++]=tmp_edge;
SETBIT(marked_edge,tmp_edge);
};
};
if(graph3->vert_degree[v1]==1) {
tmp_edge=graph3->first_edge[v1];
if(!GETBIT(marked_edge,tmp_edge)){
queue[queue_head++]=tmp_edge;
SETBIT(marked_edge,tmp_edge);
};
};
if(graph3->vert_degree[v2]==1){
tmp_edge=graph3->first_edge[v2];
if(!GETBIT(marked_edge,tmp_edge)){
queue[queue_head++]=tmp_edge;
SETBIT(marked_edge,tmp_edge);
};
};
};
free(marked_edge);
return (int)(queue_head-nedges);/* returns 0 if successful, otherwise returns a negative number */
};
static int bdz_mapping(cmph_config_t *mph, bdz_graph3_t* graph3, bdz_queue_t queue);
static void assigning(bdz_config_data_t *bdz, bdz_graph3_t* graph3, bdz_queue_t queue);
static void ranking(bdz_config_data_t *bdz);
static cmph_uint32 rank(cmph_uint32 b, cmph_uint32 * ranktable, cmph_uint8 * g, cmph_uint32 vertex);
bdz_config_data_t *bdz_config_new(void)
{
bdz_config_data_t *bdz;
bdz = (bdz_config_data_t *)malloc(sizeof(bdz_config_data_t));
assert(bdz);
memset(bdz, 0, sizeof(bdz_config_data_t));
bdz->hashfunc = CMPH_HASH_JENKINS;
bdz->g = NULL;
bdz->hl = NULL;
bdz->k = 0; //kth index in ranktable, $k = log_2(n=3r)/\varepsilon$
bdz->b = 7; // number of bits of k
bdz->ranktablesize = 0; //number of entries in ranktable, $n/k +1$
bdz->ranktable = NULL; // rank table
return bdz;
}
void bdz_config_destroy(cmph_config_t *mph)
{
bdz_config_data_t *data = (bdz_config_data_t *)mph->data;
DEBUGP("Destroying algorithm dependent data\n");
free(data);
}
void bdz_config_set_b(cmph_config_t *mph, cmph_uint32 b)
{
bdz_config_data_t *bdz = (bdz_config_data_t *)mph->data;
if (b <= 2 || b > 10) b = 7; // validating restrictions over parameter b.
bdz->b = (cmph_uint8)b;
DEBUGP("b: %u\n", b);
}
void bdz_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
bdz_config_data_t *bdz = (bdz_config_data_t *)mph->data;
CMPH_HASH *hashptr = hashfuncs;
cmph_uint32 i = 0;
while(*hashptr != CMPH_HASH_COUNT)
{
if (i >= 1) break; //bdz only uses one linear hash function
bdz->hashfunc = *hashptr;
++i, ++hashptr;
}
}
cmph_t *bdz_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
bdz_data_t *bdzf = NULL;
cmph_uint32 iterations;
bdz_queue_t edges;
bdz_graph3_t graph3;
bdz_config_data_t *bdz = (bdz_config_data_t *)mph->data;
#ifdef CMPH_TIMING
double construction_time_begin = 0.0;
double construction_time = 0.0;
ELAPSED_TIME_IN_SECONDS(&construction_time_begin);
#endif
if (c == 0) c = 1.23; // validating restrictions over parameter c.
DEBUGP("c: %f\n", c);
bdz->m = mph->key_source->nkeys;
bdz->r = (cmph_uint32)ceil((c * mph->key_source->nkeys)/3);
if ((bdz->r % 2) == 0) bdz->r+=1;
bdz->n = 3*bdz->r;
bdz->k = (1U << bdz->b);
DEBUGP("b: %u -- k: %u\n", bdz->b, bdz->k);
bdz->ranktablesize = (cmph_uint32)ceil(bdz->n/(double)bdz->k);
DEBUGP("ranktablesize: %u\n", bdz->ranktablesize);
bdz_alloc_graph3(&graph3, bdz->m, bdz->n);
bdz_alloc_queue(&edges,bdz->m);
DEBUGP("Created hypergraph\n");
DEBUGP("m (edges): %u n (vertices): %u r: %u c: %f \n", bdz->m, bdz->n, bdz->r, c);
// Mapping step
iterations = 1000;
if (mph->verbosity)
{
fprintf(stderr, "Entering mapping step for mph creation of %u keys with graph sized %u\n", bdz->m, bdz->n);
}
while(1)
{
int ok;
DEBUGP("linear hash function \n");
bdz->hl = hash_state_new(bdz->hashfunc, 15);
ok = bdz_mapping(mph, &graph3, edges);
//ok = 0;
if (!ok)
{
--iterations;
hash_state_destroy(bdz->hl);
bdz->hl = NULL;
DEBUGP("%u iterations remaining\n", iterations);
if (mph->verbosity)
{
fprintf(stderr, "acyclic graph creation failure - %u iterations remaining\n", iterations);
}
if (iterations == 0) break;
}
else break;
}
if (iterations == 0)
{
bdz_free_queue(&edges);
bdz_free_graph3(&graph3);
return NULL;
}
bdz_partial_free_graph3(&graph3);
// Assigning step
if (mph->verbosity)
{
fprintf(stderr, "Entering assigning step for mph creation of %u keys with graph sized %u\n", bdz->m, bdz->n);
}
assigning(bdz, &graph3, edges);
bdz_free_queue(&edges);
bdz_free_graph3(&graph3);
if (mph->verbosity)
{
fprintf(stderr, "Entering ranking step for mph creation of %u keys with graph sized %u\n", bdz->m, bdz->n);
}
ranking(bdz);
#ifdef CMPH_TIMING
ELAPSED_TIME_IN_SECONDS(&construction_time);
#endif
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
bdzf = (bdz_data_t *)malloc(sizeof(bdz_data_t));
bdzf->g = bdz->g;
bdz->g = NULL; //transfer memory ownership
bdzf->hl = bdz->hl;
bdz->hl = NULL; //transfer memory ownership
bdzf->ranktable = bdz->ranktable;
bdz->ranktable = NULL; //transfer memory ownership
bdzf->ranktablesize = bdz->ranktablesize;
bdzf->k = bdz->k;
bdzf->b = bdz->b;
bdzf->n = bdz->n;
bdzf->m = bdz->m;
bdzf->r = bdz->r;
mphf->data = bdzf;
mphf->size = bdz->m;
DEBUGP("Successfully generated minimal perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated minimal perfect hash function\n");
}
#ifdef CMPH_TIMING
register cmph_uint32 space_usage = bdz_packed_size(mphf)*8;
register cmph_uint32 keys_per_bucket = 1;
construction_time = construction_time - construction_time_begin;
fprintf(stdout, "%u\t%.2f\t%u\t%.4f\t%.4f\n", bdz->m, bdz->m/(double)bdz->n, keys_per_bucket, construction_time, space_usage/(double)bdz->m);
#endif
return mphf;
}
static int bdz_mapping(cmph_config_t *mph, bdz_graph3_t* graph3, bdz_queue_t queue)
{
cmph_uint32 e;
int cycles = 0;
cmph_uint32 hl[3];
bdz_config_data_t *bdz = (bdz_config_data_t *)mph->data;
bdz_init_graph3(graph3, bdz->m, bdz->n);
mph->key_source->rewind(mph->key_source->data);
for (e = 0; e < mph->key_source->nkeys; ++e)
{
cmph_uint32 h0, h1, h2;
cmph_uint32 keylen;
char *key = NULL;
mph->key_source->read(mph->key_source->data, &key, &keylen);
hash_vector(bdz->hl, key, keylen,hl);
h0 = hl[0] % bdz->r;
h1 = hl[1] % bdz->r + bdz->r;
h2 = hl[2] % bdz->r + (bdz->r << 1);
mph->key_source->dispose(mph->key_source->data, key, keylen);
bdz_add_edge(graph3,h0,h1,h2);
}
cycles = bdz_generate_queue(bdz->m, bdz->n, queue, graph3);
return (cycles == 0);
}
static void assigning(bdz_config_data_t *bdz, bdz_graph3_t* graph3, bdz_queue_t queue)
{
cmph_uint32 i;
cmph_uint32 nedges=graph3->nedges;
cmph_uint32 curr_edge;
cmph_uint32 v0,v1,v2;
cmph_uint8 * marked_vertices =malloc((size_t)(bdz->n >> 3) + 1);
cmph_uint32 sizeg = (cmph_uint32)ceil(bdz->n/4.0);
bdz->g = (cmph_uint8 *)calloc((size_t)(sizeg), sizeof(cmph_uint8));
memset(marked_vertices, 0, (size_t)(bdz->n >> 3) + 1);
memset(bdz->g, 0xff, (size_t)(sizeg));
for(i=nedges-1;i+1>0;i--){
curr_edge=queue[i];
v0=graph3->edges[curr_edge].vertices[0];
v1=graph3->edges[curr_edge].vertices[1];
v2=graph3->edges[curr_edge].vertices[2];
DEBUGP("B:%u %u %u -- %u %u %u\n", v0, v1, v2, GETVALUE(bdz->g, v0), GETVALUE(bdz->g, v1), GETVALUE(bdz->g, v2));
if(!GETBIT(marked_vertices, v0)){
if(!GETBIT(marked_vertices,v1))
{
SETVALUE1(bdz->g, v1, UNASSIGNED);
SETBIT(marked_vertices, v1);
}
if(!GETBIT(marked_vertices,v2))
{
SETVALUE1(bdz->g, v2, UNASSIGNED);
SETBIT(marked_vertices, v2);
}
SETVALUE1(bdz->g, v0, (6-(GETVALUE(bdz->g, v1) + GETVALUE(bdz->g,v2)))%3);
SETBIT(marked_vertices, v0);
} else if(!GETBIT(marked_vertices, v1)) {
if(!GETBIT(marked_vertices, v2))
{
SETVALUE1(bdz->g, v2, UNASSIGNED);
SETBIT(marked_vertices, v2);
}
SETVALUE1(bdz->g, v1, (7-(GETVALUE(bdz->g, v0)+GETVALUE(bdz->g, v2)))%3);
SETBIT(marked_vertices, v1);
}else {
SETVALUE1(bdz->g, v2, (8-(GETVALUE(bdz->g,v0)+GETVALUE(bdz->g, v1)))%3);
SETBIT(marked_vertices, v2);
}
DEBUGP("A:%u %u %u -- %u %u %u\n", v0, v1, v2, GETVALUE(bdz->g, v0), GETVALUE(bdz->g, v1), GETVALUE(bdz->g, v2));
};
free(marked_vertices);
}
static void ranking(bdz_config_data_t *bdz)
{
cmph_uint32 i, j, offset = 0U, count = 0U, size = (bdz->k >> 2U), nbytes_total = (cmph_uint32)ceil(bdz->n/4.0), nbytes;
bdz->ranktable = (cmph_uint32 *)calloc((size_t)bdz->ranktablesize, sizeof(cmph_uint32));
// ranktable computation
bdz->ranktable[0] = 0;
i = 1;
while(1)
{
if(i == bdz->ranktablesize) break;
nbytes = size < nbytes_total? size : nbytes_total;
for(j = 0; j < nbytes; j++)
{
count += bdz_lookup_table[*(bdz->g + offset + j)];
}
bdz->ranktable[i] = count;
offset += nbytes;
nbytes_total -= size;
i++;
}
}
int bdz_dump(cmph_t *mphf, FILE *fd)
{
char *buf = NULL;
cmph_uint32 buflen;
register size_t nbytes;
bdz_data_t *data = (bdz_data_t *)mphf->data;
cmph_uint32 sizeg;
#ifdef DEBUG
cmph_uint32 i;
#endif
__cmph_dump(mphf, fd);
hash_state_dump(data->hl, &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
nbytes = fwrite(&(data->n), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->m), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->r), sizeof(cmph_uint32), (size_t)1, fd);
sizeg = (cmph_uint32)ceil(data->n/4.0);
nbytes = fwrite(data->g, sizeof(cmph_uint8)*sizeg, (size_t)1, fd);
nbytes = fwrite(&(data->k), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->b), sizeof(cmph_uint8), (size_t)1, fd);
nbytes = fwrite(&(data->ranktablesize), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(data->ranktable, sizeof(cmph_uint32)*(data->ranktablesize), (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return 0;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < data->n; ++i) fprintf(stderr, "%u ", GETVALUE(data->g, i));
fprintf(stderr, "\n");
#endif
return 1;
}
void bdz_load(FILE *f, cmph_t *mphf)
{
char *buf = NULL;
cmph_uint32 buflen, sizeg;
register size_t nbytes;
bdz_data_t *bdz = (bdz_data_t *)malloc(sizeof(bdz_data_t));
#ifdef DEBUG
cmph_uint32 i = 0;
#endif
DEBUGP("Loading bdz mphf\n");
mphf->data = bdz;
nbytes = fread(&buflen, sizeof(cmph_uint32), (size_t)1, f);
DEBUGP("Hash state has %u bytes\n", buflen);
buf = (char *)malloc((size_t)buflen);
nbytes = fread(buf, (size_t)buflen, (size_t)1, f);
bdz->hl = hash_state_load(buf, buflen);
free(buf);
DEBUGP("Reading m and n\n");
nbytes = fread(&(bdz->n), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(bdz->m), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(bdz->r), sizeof(cmph_uint32), (size_t)1, f);
sizeg = (cmph_uint32)ceil(bdz->n/4.0);
bdz->g = (cmph_uint8 *)calloc((size_t)(sizeg), sizeof(cmph_uint8));
nbytes = fread(bdz->g, sizeg*sizeof(cmph_uint8), (size_t)1, f);
nbytes = fread(&(bdz->k), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(bdz->b), sizeof(cmph_uint8), (size_t)1, f);
nbytes = fread(&(bdz->ranktablesize), sizeof(cmph_uint32), (size_t)1, f);
bdz->ranktable = (cmph_uint32 *)calloc((size_t)bdz->ranktablesize, sizeof(cmph_uint32));
nbytes = fread(bdz->ranktable, sizeof(cmph_uint32)*(bdz->ranktablesize), (size_t)1, f);
if (nbytes == 0 && ferror(f)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return;
}
#ifdef DEBUG
i = 0;
fprintf(stderr, "G: ");
for (i = 0; i < bdz->n; ++i) fprintf(stderr, "%u ", GETVALUE(bdz->g,i));
fprintf(stderr, "\n");
#endif
return;
}
/*
static cmph_uint32 bdz_search_ph(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
bdz_data_t *bdz = mphf->data;
cmph_uint32 hl[3];
hash_vector(bdz->hl, key, keylen, hl);
cmph_uint32 vertex;
hl[0] = hl[0] % bdz->r;
hl[1] = hl[1] % bdz->r + bdz->r;
hl[2] = hl[2] % bdz->r + (bdz->r << 1);
vertex = hl[(GETVALUE(bdz->g, hl[0]) + GETVALUE(bdz->g, hl[1]) + GETVALUE(bdz->g, hl[2])) % 3];
return vertex;
}
*/
static inline cmph_uint32 rank(cmph_uint32 b, cmph_uint32 * ranktable, cmph_uint8 * g, cmph_uint32 vertex)
{
register cmph_uint32 index = vertex >> b;
register cmph_uint32 base_rank = ranktable[index];
register cmph_uint32 beg_idx_v = index << b;
register cmph_uint32 beg_idx_b = beg_idx_v >> 2;
register cmph_uint32 end_idx_b = vertex >> 2;
while(beg_idx_b < end_idx_b)
{
base_rank += bdz_lookup_table[*(g + beg_idx_b++)];
}
beg_idx_v = beg_idx_b << 2;
while(beg_idx_v < vertex)
{
if(GETVALUE(g, beg_idx_v) != UNASSIGNED) base_rank++;
beg_idx_v++;
}
return base_rank;
}
cmph_uint32 bdz_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
register cmph_uint32 vertex;
register bdz_data_t *bdz = mphf->data;
cmph_uint32 hl[3];
hash_vector(bdz->hl, key, keylen, hl);
hl[0] = hl[0] % bdz->r;
hl[1] = hl[1] % bdz->r + bdz->r;
hl[2] = hl[2] % bdz->r + (bdz->r << 1);
vertex = hl[(GETVALUE(bdz->g, hl[0]) + GETVALUE(bdz->g, hl[1]) + GETVALUE(bdz->g, hl[2])) % 3];
return rank(bdz->b, bdz->ranktable, bdz->g, vertex);
}
void bdz_destroy(cmph_t *mphf)
{
bdz_data_t *data = (bdz_data_t *)mphf->data;
free(data->g);
hash_state_destroy(data->hl);
free(data->ranktable);
free(data);
free(mphf);
}
/** \fn void bdz_pack(cmph_t *mphf, void *packed_mphf);
* \brief Pack a perfect hash function into the preallocated contiguous memory area pointed to by packed_mphf.
* \param mphf pointer to the mphf to be packed
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bdz_pack(cmph_t *mphf, void *packed_mphf)
{
bdz_data_t *data = (bdz_data_t *)mphf->data;
cmph_uint8 * ptr = packed_mphf;
cmph_uint32 sizeg;
// packing hl type
CMPH_HASH hl_type = hash_get_type(data->hl);
*((cmph_uint32 *) ptr) = hl_type;
ptr += sizeof(cmph_uint32);
// packing hl
hash_state_pack(data->hl, ptr);
ptr += hash_state_packed_size(hl_type);
// packing r
*((cmph_uint32 *) ptr) = data->r;
ptr += sizeof(data->r);
// packing ranktablesize
*((cmph_uint32 *) ptr) = data->ranktablesize;
ptr += sizeof(data->ranktablesize);
// packing ranktable
memcpy(ptr, data->ranktable, sizeof(cmph_uint32)*(data->ranktablesize));
ptr += sizeof(cmph_uint32)*(data->ranktablesize);
// packing b
*ptr++ = data->b;
// packing g
sizeg = (cmph_uint32)ceil(data->n/4.0);
memcpy(ptr, data->g, sizeof(cmph_uint8)*sizeg);
}
/** \fn cmph_uint32 bdz_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bdz_packed_size(cmph_t *mphf)
{
bdz_data_t *data = (bdz_data_t *)mphf->data;
CMPH_HASH hl_type = hash_get_type(data->hl);
return (cmph_uint32)(sizeof(CMPH_ALGO) + hash_state_packed_size(hl_type) + 3*sizeof(cmph_uint32) + sizeof(cmph_uint32)*(data->ranktablesize) + sizeof(cmph_uint8) + sizeof(cmph_uint8)* (cmph_uint32)(ceil(data->n/4.0)));
}
/** \fn cmph_uint32 bdz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 bdz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
register cmph_uint32 vertex;
register CMPH_HASH hl_type = *(cmph_uint32 *)packed_mphf;
register cmph_uint8 *hl_ptr = (cmph_uint8 *)(packed_mphf) + 4;
register cmph_uint32 *ranktable = (cmph_uint32*)(hl_ptr + hash_state_packed_size(hl_type));
register cmph_uint32 r = *ranktable++;
register cmph_uint32 ranktablesize = *ranktable++;
register cmph_uint8 * g = (cmph_uint8 *)(ranktable + ranktablesize);
register cmph_uint8 b = *g++;
cmph_uint32 hl[3];
hash_vector_packed(hl_ptr, hl_type, key, keylen, hl);
hl[0] = hl[0] % r;
hl[1] = hl[1] % r + r;
hl[2] = hl[2] % r + (r << 1);
vertex = hl[(GETVALUE(g, hl[0]) + GETVALUE(g, hl[1]) + GETVALUE(g, hl[2])) % 3];
return rank(b, ranktable, g, vertex);
}
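
The rank() helper above counts how many vertices before a given index hold a value other than UNASSIGNED, with g storing one 2-bit value per vertex. The following is a minimal sketch of that idea only, not the cmph implementation: get2, set2 and naive_rank are hypothetical names, the bit layout is an assumption (the real GETVALUE/SETVALUE1 macros live in bitbool.h, which is not shown here), and the ranktable and bdz_lookup_table shortcuts are left out.

```c
#include <assert.h>
#include <stdint.h>

#define UNASSIGNED 3u

/* Hypothetical helper: read the 2-bit value stored for vertex i.
 * Assumed layout: four vertices per byte, lowest bits first. */
static unsigned get2(const uint8_t *g, uint32_t i)
{
    return (g[i >> 2] >> ((i & 3u) << 1)) & 3u;
}

/* Hypothetical helper: overwrite the 2-bit value stored for vertex i. */
static void set2(uint8_t *g, uint32_t i, unsigned v)
{
    unsigned shift = (i & 3u) << 1;
    assert(v <= 3u);
    g[i >> 2] = (uint8_t)((g[i >> 2] & ~(3u << shift)) | (v << shift));
}

/* Number of vertices with index < vertex whose value is not UNASSIGNED.
 * This is the quantity rank() computes, here without any acceleration. */
static uint32_t naive_rank(const uint8_t *g, uint32_t vertex)
{
    uint32_t i, r = 0;
    for (i = 0; i < vertex; i++)
        if (get2(g, i) != UNASSIGNED)
            r++;
    return r;
}
```

Starting from two bytes of 0xff (eight unassigned vertices) and assigning vertices 1, 4 and 6, naive_rank returns 0, 2 and 3 for vertices 1, 5 and 7. The ranktable in the real code stores such counts at every 2^b-th vertex so that a lookup only scans a short tail.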

43
girepository/cmph/bdz.h Normal file

@ -0,0 +1,43 @@
#ifndef __CMPH_BDZ_H__
#define __CMPH_BDZ_H__
#include "cmph.h"
typedef struct __bdz_data_t bdz_data_t;
typedef struct __bdz_config_data_t bdz_config_data_t;
bdz_config_data_t *bdz_config_new(void);
void bdz_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void bdz_config_destroy(cmph_config_t *mph);
void bdz_config_set_b(cmph_config_t *mph, cmph_uint32 b);
cmph_t *bdz_new(cmph_config_t *mph, double c);
void bdz_load(FILE *f, cmph_t *mphf);
int bdz_dump(cmph_t *mphf, FILE *f);
void bdz_destroy(cmph_t *mphf);
cmph_uint32 bdz_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void bdz_pack(cmph_t *mphf, void *packed_mphf);
* \brief Pack a perfect hash function into the preallocated contiguous memory area pointed to by packed_mphf.
* \param mphf pointer to the mphf to be packed
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bdz_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 bdz_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bdz_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 bdz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 bdz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif
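
bdz_search and bdz_search_packed, declared above, both reduce the three hash values of a key to one vertex in each third of the range [0, 3r) and then let the sum of the three g values modulo 3 pick one of them. The sketch below restates that arithmetic in isolation, under the assumption that the g values have already been read. map_vertices, select_vertex and assign_g are hypothetical names introduced for illustration; assign_g mirrors the constants 6, 7 and 8 used by the assigning step in bdz.c.

```c
#include <assert.h>
#include <stdint.h>

/* Map three raw hash values to one vertex in each third of [0, 3r). */
static void map_vertices(const uint32_t h[3], uint32_t r, uint32_t out[3])
{
    assert(r > 0);
    out[0] = h[0] % r;
    out[1] = h[1] % r + r;
    out[2] = h[2] % r + (r << 1);
}

/* Pick the vertex selected by the three g values (each in 0..3). */
static uint32_t select_vertex(const uint32_t v[3], const unsigned gval[3])
{
    return v[(gval[0] + gval[1] + gval[2]) % 3u];
}

/* Value to store at position `index` (0, 1 or 2) of an edge so that the
 * sum of the three g values is congruent to `index` modulo 3.
 * ga and gb are the g values of the other two vertices (each in 0..3). */
static unsigned assign_g(unsigned index, unsigned ga, unsigned gb)
{
    assert(index < 3u && ga <= 3u && gb <= 3u);
    return (6u + index - (ga + gb)) % 3u;
}
```

Because an unassigned vertex holds 3, which is 0 modulo 3, unassigned vertices do not disturb the sum; that is why the selection needs no special case for them.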


@ -0,0 +1,33 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void help(char * prname)
{
fprintf(stderr, "Usage: %s <n> <wordsizeinbits>\n", prname);
exit(1);
}
int main(int argc, char ** argv)
{
if(argc != 3) help(argv[0]);
int n = atoi(argv[1]);
int wordsize = (atoi(argv[2]) >> 1);
int i, j, n_assigned;
for(i = 0; i < n; i++)
{
int num = i;
n_assigned = 0;
for(j = 0; j < wordsize; j++)
{
if ((num & 0x0003) != 3)
{
n_assigned++;
//fprintf(stderr, "num:%d\n", num);
}
num = num >> 2;
}
if(i%16 == 0) fprintf(stderr, "\n");
fprintf(stderr, "%d, ", n_assigned);
}
fprintf(stderr, "\n");
}
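
The generator above prints, for each integer below n, how many of its 2-bit fields differ from 3 (the UNASSIGNED marker). As a hedged sketch, the same computation can be written to fill an array directly. assigned_in_byte and fill_table are hypothetical names, and the assumption that the table in bdz.c corresponds to n = 256 with an 8-bit word is not stated in the source.

```c
#include <assert.h>
#include <stdint.h>

/* Count the 2-bit fields of one byte whose value is not 3 (UNASSIGNED). */
static unsigned assigned_in_byte(unsigned byte)
{
    unsigned count = 0;
    int j;
    for (j = 0; j < 4; j++) {
        if ((byte & 0x3u) != 3u)
            count++;
        byte >>= 2;
    }
    return count;
}

/* Fill table[0..n-1]; n = 256 covers every possible byte value. */
static void fill_table(uint8_t *table, unsigned n)
{
    unsigned i;
    assert(n <= 256u);
    for (i = 0; i < n; i++)
        table[i] = (uint8_t)assigned_in_byte(i);
}
```

With this layout the entry for 0x00 is 4 (four assigned fields) and the entry for 0xff is 0 (all four fields unassigned).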

633
girepository/cmph/bdz_ph.c Normal file

@ -0,0 +1,633 @@
#include "bdz_ph.h"
#include "cmph_structs.h"
#include "bdz_structs_ph.h"
#include "hash.h"
#include "bitbool.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
//#define DEBUG
#include "debug.h"
#define UNASSIGNED 3
#define NULL_EDGE 0xffffffff
static cmph_uint8 pow3_table[5] = {1,3,9,27,81};
static cmph_uint8 lookup_table[5][256] = {
{0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0},
{0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 1},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
};
typedef struct
{
cmph_uint32 vertices[3];
cmph_uint32 next_edges[3];
}bdz_ph_edge_t;
typedef cmph_uint32 * bdz_ph_queue_t;
static void bdz_ph_alloc_queue(bdz_ph_queue_t * queuep, cmph_uint32 nedges)
{
(*queuep)=malloc(nedges*sizeof(cmph_uint32));
};
static void bdz_ph_free_queue(bdz_ph_queue_t * queue)
{
free(*queue);
};
typedef struct
{
cmph_uint32 nedges;
bdz_ph_edge_t * edges;
cmph_uint32 * first_edge;
cmph_uint8 * vert_degree;
}bdz_ph_graph3_t;
static void bdz_ph_alloc_graph3(bdz_ph_graph3_t * graph3, cmph_uint32 nedges, cmph_uint32 nvertices)
{
graph3->edges=malloc(nedges*sizeof(bdz_ph_edge_t));
graph3->first_edge=malloc(nvertices*sizeof(cmph_uint32));
graph3->vert_degree=malloc((size_t)nvertices);
};
static void bdz_ph_init_graph3(bdz_ph_graph3_t * graph3, cmph_uint32 nedges, cmph_uint32 nvertices)
{
memset(graph3->first_edge,0xff,nvertices*sizeof(cmph_uint32));
memset(graph3->vert_degree,0,(size_t)nvertices);
graph3->nedges=0;
};
static void bdz_ph_free_graph3(bdz_ph_graph3_t *graph3)
{
free(graph3->edges);
free(graph3->first_edge);
free(graph3->vert_degree);
};
static void bdz_ph_partial_free_graph3(bdz_ph_graph3_t *graph3)
{
free(graph3->first_edge);
free(graph3->vert_degree);
graph3->first_edge = NULL;
graph3->vert_degree = NULL;
};
static void bdz_ph_add_edge(bdz_ph_graph3_t * graph3, cmph_uint32 v0, cmph_uint32 v1, cmph_uint32 v2)
{
graph3->edges[graph3->nedges].vertices[0]=v0;
graph3->edges[graph3->nedges].vertices[1]=v1;
graph3->edges[graph3->nedges].vertices[2]=v2;
graph3->edges[graph3->nedges].next_edges[0]=graph3->first_edge[v0];
graph3->edges[graph3->nedges].next_edges[1]=graph3->first_edge[v1];
graph3->edges[graph3->nedges].next_edges[2]=graph3->first_edge[v2];
graph3->first_edge[v0]=graph3->first_edge[v1]=graph3->first_edge[v2]=graph3->nedges;
graph3->vert_degree[v0]++;
graph3->vert_degree[v1]++;
graph3->vert_degree[v2]++;
graph3->nedges++;
};
static void bdz_ph_dump_graph(bdz_ph_graph3_t* graph3, cmph_uint32 nedges, cmph_uint32 nvertices)
{
cmph_uint32 i;
for(i=0;i<nedges;i++){
printf("\nedge %u %u %u %u ",i,graph3->edges[i].vertices[0],
graph3->edges[i].vertices[1],graph3->edges[i].vertices[2]);
printf(" nexts %u %u %u",graph3->edges[i].next_edges[0],
graph3->edges[i].next_edges[1],graph3->edges[i].next_edges[2]);
};
for(i=0;i<nvertices;i++){
printf("\nfirst for vertex %u %u ",i,graph3->first_edge[i]);
};
};
static void bdz_ph_remove_edge(bdz_ph_graph3_t * graph3, cmph_uint32 curr_edge)
{
cmph_uint32 i,j=0,vert,edge1,edge2;
for(i=0;i<3;i++){
vert=graph3->edges[curr_edge].vertices[i];
edge1=graph3->first_edge[vert];
edge2=NULL_EDGE;
while(edge1!=curr_edge&&edge1!=NULL_EDGE){
edge2=edge1;
if(graph3->edges[edge1].vertices[0]==vert){
j=0;
} else if(graph3->edges[edge1].vertices[1]==vert){
j=1;
} else
j=2;
edge1=graph3->edges[edge1].next_edges[j];
};
if(edge1==NULL_EDGE){
printf("\nerror remove edge %u dump graph",curr_edge);
bdz_ph_dump_graph(graph3,graph3->nedges,graph3->nedges+graph3->nedges/4);
exit(-1);
};
if(edge2!=NULL_EDGE){
graph3->edges[edge2].next_edges[j] =
graph3->edges[edge1].next_edges[i];
} else
graph3->first_edge[vert]=
graph3->edges[edge1].next_edges[i];
graph3->vert_degree[vert]--;
};
};
static int bdz_ph_generate_queue(cmph_uint32 nedges, cmph_uint32 nvertices, bdz_ph_queue_t queue, bdz_ph_graph3_t* graph3)
{
cmph_uint32 i,v0,v1,v2;
cmph_uint32 queue_head=0,queue_tail=0;
cmph_uint32 curr_edge;
cmph_uint32 tmp_edge;
cmph_uint8 * marked_edge =malloc((size_t)(nedges >> 3) + 1);
memset(marked_edge, 0, (size_t)(nedges >> 3) + 1);
for(i=0;i<nedges;i++){
v0=graph3->edges[i].vertices[0];
v1=graph3->edges[i].vertices[1];
v2=graph3->edges[i].vertices[2];
if(graph3->vert_degree[v0]==1 ||
graph3->vert_degree[v1]==1 ||
graph3->vert_degree[v2]==1){
if(!GETBIT(marked_edge,i)) {
queue[queue_head++]=i;
SETBIT(marked_edge,i);
}
};
};
while(queue_tail!=queue_head){
curr_edge=queue[queue_tail++];
bdz_ph_remove_edge(graph3,curr_edge);
v0=graph3->edges[curr_edge].vertices[0];
v1=graph3->edges[curr_edge].vertices[1];
v2=graph3->edges[curr_edge].vertices[2];
if(graph3->vert_degree[v0]==1 ) {
tmp_edge=graph3->first_edge[v0];
if(!GETBIT(marked_edge,tmp_edge)) {
queue[queue_head++]=tmp_edge;
SETBIT(marked_edge,tmp_edge);
};
};
if(graph3->vert_degree[v1]==1) {
tmp_edge=graph3->first_edge[v1];
if(!GETBIT(marked_edge,tmp_edge)){
queue[queue_head++]=tmp_edge;
SETBIT(marked_edge,tmp_edge);
};
};
if(graph3->vert_degree[v2]==1){
tmp_edge=graph3->first_edge[v2];
if(!GETBIT(marked_edge,tmp_edge)){
queue[queue_head++]=tmp_edge;
SETBIT(marked_edge,tmp_edge);
};
};
};
free(marked_edge);
return (int)queue_head - (int)nedges;/* returns 0 if successful, otherwise returns a negative number */
};
static int bdz_ph_mapping(cmph_config_t *mph, bdz_ph_graph3_t* graph3, bdz_ph_queue_t queue);
static void assigning(bdz_ph_config_data_t *bdz_ph, bdz_ph_graph3_t* graph3, bdz_ph_queue_t queue);
static void bdz_ph_optimization(bdz_ph_config_data_t *bdz_ph);
bdz_ph_config_data_t *bdz_ph_config_new(void)
{
bdz_ph_config_data_t *bdz_ph;
bdz_ph = (bdz_ph_config_data_t *)malloc(sizeof(bdz_ph_config_data_t));
assert(bdz_ph);
memset(bdz_ph, 0, sizeof(bdz_ph_config_data_t));
bdz_ph->hashfunc = CMPH_HASH_JENKINS;
bdz_ph->g = NULL;
bdz_ph->hl = NULL;
return bdz_ph;
}
void bdz_ph_config_destroy(cmph_config_t *mph)
{
bdz_ph_config_data_t *data = (bdz_ph_config_data_t *)mph->data;
DEBUGP("Destroying algorithm dependent data\n");
free(data);
}
void bdz_ph_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
bdz_ph_config_data_t *bdz_ph = (bdz_ph_config_data_t *)mph->data;
CMPH_HASH *hashptr = hashfuncs;
cmph_uint32 i = 0;
while(*hashptr != CMPH_HASH_COUNT)
{
if (i >= 1) break; //bdz_ph only uses one linear hash function
bdz_ph->hashfunc = *hashptr;
++i, ++hashptr;
}
}
cmph_t *bdz_ph_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
bdz_ph_data_t *bdz_phf = NULL;
cmph_uint32 iterations;
bdz_ph_queue_t edges;
bdz_ph_graph3_t graph3;
bdz_ph_config_data_t *bdz_ph = (bdz_ph_config_data_t *)mph->data;
#ifdef CMPH_TIMING
double construction_time_begin = 0.0;
double construction_time = 0.0;
ELAPSED_TIME_IN_SECONDS(&construction_time_begin);
#endif
if (c == 0) c = 1.23; // validating restrictions over parameter c.
DEBUGP("c: %f\n", c);
bdz_ph->m = mph->key_source->nkeys;
bdz_ph->r = (cmph_uint32)ceil((c * mph->key_source->nkeys)/3);
if ((bdz_ph->r % 2) == 0) bdz_ph->r += 1;
bdz_ph->n = 3*bdz_ph->r;
bdz_ph_alloc_graph3(&graph3, bdz_ph->m, bdz_ph->n);
bdz_ph_alloc_queue(&edges,bdz_ph->m);
DEBUGP("Created hypergraph\n");
DEBUGP("m (edges): %u n (vertices): %u r: %u c: %f \n", bdz_ph->m, bdz_ph->n, bdz_ph->r, c);
// Mapping step
iterations = 100;
if (mph->verbosity)
{
fprintf(stderr, "Entering mapping step for mph creation of %u keys with graph sized %u\n", bdz_ph->m, bdz_ph->n);
}
while(1)
{
int ok;
DEBUGP("linear hash function \n");
bdz_ph->hl = hash_state_new(bdz_ph->hashfunc, 15);
ok = bdz_ph_mapping(mph, &graph3, edges);
if (!ok)
{
--iterations;
hash_state_destroy(bdz_ph->hl);
bdz_ph->hl = NULL;
DEBUGP("%u iterations remaining\n", iterations);
if (mph->verbosity)
{
fprintf(stderr, "acyclic graph creation failure - %u iterations remaining\n", iterations);
}
if (iterations == 0) break;
}
else break;
}
if (iterations == 0)
{
// free(bdz_ph->g);
bdz_ph_free_queue(&edges);
bdz_ph_free_graph3(&graph3);
return NULL;
}
bdz_ph_partial_free_graph3(&graph3);
// Assigning step
if (mph->verbosity)
{
fprintf(stderr, "Entering assigning step for mph creation of %u keys with graph sized %u\n", bdz_ph->m, bdz_ph->n);
}
assigning(bdz_ph, &graph3, edges);
bdz_ph_free_queue(&edges);
bdz_ph_free_graph3(&graph3);
if (mph->verbosity)
{
fprintf(stderr, "Starting optimization step\n");
}
bdz_ph_optimization(bdz_ph);
#ifdef CMPH_TIMING
ELAPSED_TIME_IN_SECONDS(&construction_time);
#endif
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
bdz_phf = (bdz_ph_data_t *)malloc(sizeof(bdz_ph_data_t));
bdz_phf->g = bdz_ph->g;
bdz_ph->g = NULL; //transfer memory ownership
bdz_phf->hl = bdz_ph->hl;
bdz_ph->hl = NULL; //transfer memory ownership
bdz_phf->n = bdz_ph->n;
bdz_phf->m = bdz_ph->m;
bdz_phf->r = bdz_ph->r;
mphf->data = bdz_phf;
mphf->size = bdz_ph->n;
DEBUGP("Successfully generated perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated perfect hash function\n");
}
#ifdef CMPH_TIMING
register cmph_uint32 space_usage = bdz_ph_packed_size(mphf)*8;
register cmph_uint32 keys_per_bucket = 1;
construction_time = construction_time - construction_time_begin;
fprintf(stdout, "%u\t%.2f\t%u\t%.4f\t%.4f\n", bdz_ph->m, bdz_ph->m/(double)bdz_ph->n, keys_per_bucket, construction_time, space_usage/(double)bdz_ph->m);
#endif
return mphf;
}
static int bdz_ph_mapping(cmph_config_t *mph, bdz_ph_graph3_t* graph3, bdz_ph_queue_t queue)
{
cmph_uint32 e;
int cycles = 0;
cmph_uint32 hl[3];
bdz_ph_config_data_t *bdz_ph = (bdz_ph_config_data_t *)mph->data;
bdz_ph_init_graph3(graph3, bdz_ph->m, bdz_ph->n);
mph->key_source->rewind(mph->key_source->data);
for (e = 0; e < mph->key_source->nkeys; ++e)
{
cmph_uint32 h0, h1, h2;
cmph_uint32 keylen;
char *key = NULL;
mph->key_source->read(mph->key_source->data, &key, &keylen);
hash_vector(bdz_ph->hl, key, keylen, hl);
h0 = hl[0] % bdz_ph->r;
h1 = hl[1] % bdz_ph->r + bdz_ph->r;
h2 = hl[2] % bdz_ph->r + (bdz_ph->r << 1);
mph->key_source->dispose(mph->key_source->data, key, keylen);
bdz_ph_add_edge(graph3,h0,h1,h2);
}
cycles = bdz_ph_generate_queue(bdz_ph->m, bdz_ph->n, queue, graph3);
return (cycles == 0);
}
static void assigning(bdz_ph_config_data_t *bdz_ph, bdz_ph_graph3_t* graph3, bdz_ph_queue_t queue)
{
cmph_uint32 i;
cmph_uint32 nedges=graph3->nedges;
cmph_uint32 curr_edge;
cmph_uint32 v0,v1,v2;
cmph_uint8 * marked_vertices =malloc((size_t)(bdz_ph->n >> 3) + 1);
cmph_uint32 sizeg = (cmph_uint32)ceil(bdz_ph->n/4.0);
bdz_ph->g = (cmph_uint8 *)calloc((size_t)sizeg, sizeof(cmph_uint8));
memset(marked_vertices, 0, (size_t)(bdz_ph->n >> 3) + 1);
//memset(bdz_ph->g, 0xff, sizeg);
for(i=nedges-1;i+1>=1;i--){
curr_edge=queue[i];
v0=graph3->edges[curr_edge].vertices[0];
v1=graph3->edges[curr_edge].vertices[1];
v2=graph3->edges[curr_edge].vertices[2];
DEBUGP("B:%u %u %u -- %u %u %u\n", v0, v1, v2, GETVALUE(bdz_ph->g, v0), GETVALUE(bdz_ph->g, v1), GETVALUE(bdz_ph->g, v2));
if(!GETBIT(marked_vertices, v0)){
if(!GETBIT(marked_vertices,v1))
{
//SETVALUE(bdz_ph->g, v1, UNASSIGNED);
SETBIT(marked_vertices, v1);
}
if(!GETBIT(marked_vertices,v2))
{
//SETVALUE(bdz_ph->g, v2, UNASSIGNED);
SETBIT(marked_vertices, v2);
}
SETVALUE0(bdz_ph->g, v0, (6-(GETVALUE(bdz_ph->g, v1) + GETVALUE(bdz_ph->g,v2)))%3);
SETBIT(marked_vertices, v0);
} else if(!GETBIT(marked_vertices, v1)) {
if(!GETBIT(marked_vertices, v2))
{
//SETVALUE(bdz_ph->g, v2, UNASSIGNED);
SETBIT(marked_vertices, v2);
}
SETVALUE0(bdz_ph->g, v1, (7 - (GETVALUE(bdz_ph->g, v0)+GETVALUE(bdz_ph->g, v2)))%3);
SETBIT(marked_vertices, v1);
}else {
SETVALUE0(bdz_ph->g, v2, (8-(GETVALUE(bdz_ph->g,v0)+GETVALUE(bdz_ph->g, v1)))%3);
SETBIT(marked_vertices, v2);
}
DEBUGP("A:%u %u %u -- %u %u %u\n", v0, v1, v2, GETVALUE(bdz_ph->g, v0), GETVALUE(bdz_ph->g, v1), GETVALUE(bdz_ph->g, v2));
};
free(marked_vertices);
}
static void bdz_ph_optimization(bdz_ph_config_data_t *bdz_ph)
{
cmph_uint32 i;
cmph_uint8 byte = 0;
cmph_uint32 sizeg = (cmph_uint32)ceil(bdz_ph->n/5.0);
cmph_uint8 * new_g = (cmph_uint8 *)calloc((size_t)sizeg, sizeof(cmph_uint8));
cmph_uint8 value;
cmph_uint32 idx;
for(i = 0; i < bdz_ph->n; i++)
{
idx = i/5;
byte = new_g[idx];
value = GETVALUE(bdz_ph->g, i);
byte = (cmph_uint8) (byte + value*pow3_table[i%5U]);
new_g[idx] = byte;
}
free(bdz_ph->g);
bdz_ph->g = new_g;
}
int bdz_ph_dump(cmph_t *mphf, FILE *fd)
{
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 sizeg = 0;
register size_t nbytes;
bdz_ph_data_t *data = (bdz_ph_data_t *)mphf->data;
#ifdef DEBUG
cmph_uint32 i;
#endif
__cmph_dump(mphf, fd);
hash_state_dump(data->hl, &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
nbytes = fwrite(&(data->n), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->m), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->r), sizeof(cmph_uint32), (size_t)1, fd);
sizeg = (cmph_uint32)ceil(data->n/5.0);
nbytes = fwrite(data->g, sizeof(cmph_uint8)*sizeg, (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return 0;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < data->n; ++i) fprintf(stderr, "%u ", GETVALUE(data->g, i));
fprintf(stderr, "\n");
#endif
return 1;
}
void bdz_ph_load(FILE *f, cmph_t *mphf)
{
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 sizeg = 0;
register size_t nbytes;
bdz_ph_data_t *bdz_ph = (bdz_ph_data_t *)malloc(sizeof(bdz_ph_data_t));
DEBUGP("Loading bdz_ph mphf\n");
mphf->data = bdz_ph;
nbytes = fread(&buflen, sizeof(cmph_uint32), (size_t)1, f);
DEBUGP("Hash state has %u bytes\n", buflen);
buf = (char *)malloc((size_t)buflen);
nbytes = fread(buf, (size_t)buflen, (size_t)1, f);
bdz_ph->hl = hash_state_load(buf, buflen);
free(buf);
DEBUGP("Reading m and n\n");
nbytes = fread(&(bdz_ph->n), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(bdz_ph->m), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(bdz_ph->r), sizeof(cmph_uint32), (size_t)1, f);
sizeg = (cmph_uint32)ceil(bdz_ph->n/5.0);
bdz_ph->g = (cmph_uint8 *)calloc((size_t)sizeg, sizeof(cmph_uint8));
nbytes = fread(bdz_ph->g, sizeg*sizeof(cmph_uint8), (size_t)1, f);
if (nbytes == 0 && ferror(f)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
}
return;
}
cmph_uint32 bdz_ph_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
register bdz_ph_data_t *bdz_ph = mphf->data;
cmph_uint32 hl[3];
register cmph_uint8 byte0, byte1, byte2;
register cmph_uint32 vertex;
hash_vector(bdz_ph->hl, key, keylen,hl);
hl[0] = hl[0] % bdz_ph->r;
hl[1] = hl[1] % bdz_ph->r + bdz_ph->r;
hl[2] = hl[2] % bdz_ph->r + (bdz_ph->r << 1);
byte0 = bdz_ph->g[hl[0]/5];
byte1 = bdz_ph->g[hl[1]/5];
byte2 = bdz_ph->g[hl[2]/5];
byte0 = lookup_table[hl[0]%5U][byte0];
byte1 = lookup_table[hl[1]%5U][byte1];
byte2 = lookup_table[hl[2]%5U][byte2];
vertex = hl[(byte0 + byte1 + byte2)%3];
return vertex;
}
void bdz_ph_destroy(cmph_t *mphf)
{
bdz_ph_data_t *data = (bdz_ph_data_t *)mphf->data;
free(data->g);
hash_state_destroy(data->hl);
free(data);
free(mphf);
}
/** \fn void bdz_ph_pack(cmph_t *mphf, void *packed_mphf);
* \brief Pack a perfect hash function into the preallocated contiguous memory area pointed to by packed_mphf.
* \param mphf pointer to the mphf to be packed
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bdz_ph_pack(cmph_t *mphf, void *packed_mphf)
{
bdz_ph_data_t *data = (bdz_ph_data_t *)mphf->data;
cmph_uint8 * ptr = packed_mphf;
cmph_uint32 sizeg;
// packing hl type
CMPH_HASH hl_type = hash_get_type(data->hl);
*((cmph_uint32 *) ptr) = hl_type;
ptr += sizeof(cmph_uint32);
// packing hl
hash_state_pack(data->hl, ptr);
ptr += hash_state_packed_size(hl_type);
// packing r
*((cmph_uint32 *) ptr) = data->r;
ptr += sizeof(data->r);
// packing g
sizeg = (cmph_uint32)ceil(data->n/5.0);
memcpy(ptr, data->g, sizeof(cmph_uint8)*sizeg);
}
/** \fn cmph_uint32 bdz_ph_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bdz_ph_packed_size(cmph_t *mphf)
{
bdz_ph_data_t *data = (bdz_ph_data_t *)mphf->data;
CMPH_HASH hl_type = hash_get_type(data->hl);
cmph_uint32 sizeg = (cmph_uint32)ceil(data->n/5.0);
return (cmph_uint32) (sizeof(CMPH_ALGO) + hash_state_packed_size(hl_type) + 2*sizeof(cmph_uint32) + sizeof(cmph_uint8)*sizeg);
}
/** \fn cmph_uint32 bdz_ph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 bdz_ph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
register CMPH_HASH hl_type = *(cmph_uint32 *)packed_mphf;
register cmph_uint8 *hl_ptr = (cmph_uint8 *)(packed_mphf) + 4;
register cmph_uint8 * ptr = hl_ptr + hash_state_packed_size(hl_type);
register cmph_uint32 r = *((cmph_uint32*) ptr);
register cmph_uint8 * g = ptr + 4;
cmph_uint32 hl[3];
register cmph_uint8 byte0, byte1, byte2;
register cmph_uint32 vertex;
hash_vector_packed(hl_ptr, hl_type, key, keylen, hl);
hl[0] = hl[0] % r;
hl[1] = hl[1] % r + r;
hl[2] = hl[2] % r + (r << 1);
byte0 = g[hl[0]/5];
byte1 = g[hl[1]/5];
byte2 = g[hl[2]/5];
byte0 = lookup_table[hl[0]%5][byte0];
byte1 = lookup_table[hl[1]%5][byte1];
byte2 = lookup_table[hl[2]%5][byte2];
vertex = hl[(byte0 + byte1 + byte2)%3];
return vertex;
}
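
The search routines above read `g` as a byte array holding five values per byte: `hl/5` selects the byte, `hl%5` selects the slot, and `sizeg` is `ceil(n/5.0)`. Below is a minimal sketch of that packing. It assumes each slot is a base-3 digit and that `lookup_table[i][byte]` (defined outside this excerpt) yields `(byte / 3^i) % 3`. The names `pow3`, `trit_set`, `trit_get` and `g_digit` are hypothetical and not part of cmph.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not cmph API: five base-3 digits (0..2) fit in
 * one byte because 3^5 = 243 <= 256. */
static const uint8_t pow3[5] = { 1, 3, 9, 27, 81 };

/* Store digit v (0..2) in slot i (0..4) of a byte whose slot i is still 0. */
static uint8_t trit_set(uint8_t byte, unsigned i, uint8_t v)
{
    assert(i < 5 && v < 3);
    return (uint8_t)(byte + v * pow3[i]);
}

/* Read the digit in slot i. Under the assumption stated above, this is
 * the value a table indexed as lookup_table[i][byte] would hold. */
static uint8_t trit_get(uint8_t byte, unsigned i)
{
    assert(i < 5);
    return (uint8_t)((byte / pow3[i]) % 3);
}

/* Digit for a vertex index in a packed array g, mirroring the
 * g[hl/5] and hl%5 indexing used by the search functions. */
static uint8_t g_digit(const uint8_t *g, uint32_t vertex)
{
    return trit_get(g[vertex / 5], vertex % 5);
}
```

A precomputed 5 x 256 table trades about 1.25 KiB of memory for skipping the division and modulo on every lookup, which is presumably why the search code indexes a table instead of computing the digit.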

@ -0,0 +1,42 @@
#ifndef __CMPH_BDZ_PH_H__
#define __CMPH_BDZ_PH_H__
#include "cmph.h"
typedef struct __bdz_ph_data_t bdz_ph_data_t;
typedef struct __bdz_ph_config_data_t bdz_ph_config_data_t;
bdz_ph_config_data_t *bdz_ph_config_new(void);
void bdz_ph_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void bdz_ph_config_destroy(cmph_config_t *mph);
cmph_t *bdz_ph_new(cmph_config_t *mph, double c);
void bdz_ph_load(FILE *f, cmph_t *mphf);
int bdz_ph_dump(cmph_t *mphf, FILE *f);
void bdz_ph_destroy(cmph_t *mphf);
cmph_uint32 bdz_ph_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void bdz_ph_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bdz_ph_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 bdz_ph_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bdz_ph_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 bdz_ph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 bdz_ph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif

@ -0,0 +1,36 @@
#ifndef __CMPH_BDZ_STRUCTS_H__
#define __CMPH_BDZ_STRUCTS_H__
#include "hash_state.h"
struct __bdz_data_t
{
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
cmph_uint32 r; //partition vertex count
cmph_uint8 *g;
hash_state_t *hl; // linear hashing
cmph_uint32 k; //kth index in ranktable, $k = log_2(n=3r)/\varepsilon$
cmph_uint8 b; // number of bits of k
cmph_uint32 ranktablesize; //number of entries in ranktable, $n/k +1$
cmph_uint32 *ranktable; // rank table
};
struct __bdz_config_data_t
{
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
cmph_uint32 r; //partition vertex count
cmph_uint8 *g;
hash_state_t *hl; // linear hashing
cmph_uint32 k; //kth index in ranktable, $k = log_2(n=3r)/\varepsilon$
cmph_uint8 b; // number of bits of k
cmph_uint32 ranktablesize; //number of entries in ranktable, $n/k +1$
cmph_uint32 *ranktable; // rank table
CMPH_HASH hashfunc;
};
#endif

@ -0,0 +1,26 @@
#ifndef __CMPH_BDZ_STRUCTS_PH_H__
#define __CMPH_BDZ_STRUCTS_PH_H__
#include "hash_state.h"
struct __bdz_ph_data_t
{
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
cmph_uint32 r; //partition vertex count
cmph_uint8 *g;
hash_state_t *hl; // linear hashing
};
struct __bdz_ph_config_data_t
{
CMPH_HASH hashfunc;
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
cmph_uint32 r; //partition vertex count
cmph_uint8 *g;
hash_state_t *hl; // linear hashing
};
#endif

179
girepository/cmph/bitbool.h Normal file
@ -0,0 +1,179 @@
#ifndef _CMPH_BITBOOL_H__
#define _CMPH_BITBOOL_H__
#include "cmph_types.h"
static const cmph_uint8 bitmask[] = { 1, 1 << 1, 1 << 2, 1 << 3, 1 << 4, 1 << 5, 1 << 6, 1 << 7 };
static const cmph_uint32 bitmask32[] = { 1, 1 << 1, 1 << 2, 1 << 3, 1 << 4, 1 << 5, 1 << 6, 1 << 7,
1 << 8, 1 << 9, 1 << 10, 1 << 11, 1 << 12, 1 << 13, 1 << 14, 1 << 15,
1 << 16, 1 << 17, 1 << 18, 1 << 19, 1 << 20, 1 << 21, 1 << 22, 1 << 23,
1 << 24, 1 << 25, 1 << 26, 1 << 27, 1 << 28, 1 << 29, 1 << 30, 1U << 31
};
static const cmph_uint8 valuemask[] = { 0xfc, 0xf3, 0xcf, 0x3f};
/** \def GETBIT(array, i)
* \brief get the value of a 1-bit integer stored in an array.
* \param array to get 1-bit integer values from
* \param i is the index in array to get the 1-bit integer value from
*
* GETBIT(array, i) is a macro that gets the value of a 1-bit integer stored in array.
*/
#define GETBIT(array, i) ((array[i >> 3] & bitmask[i & 0x00000007]) >> (i & 0x00000007))
/** \def SETBIT(array, i)
* \brief set 1 to a 1-bit integer stored in an array.
* \param array to store 1-bit integer values
* \param i is the index in array to set the bit to 1
*
* SETBIT(array, i) is a macro that sets 1 to a 1-bit integer stored in an array.
*/
#define SETBIT(array, i) (array[i >> 3] |= bitmask[i & 0x00000007])
/** \def UNSETBIT(array, i)
* \brief set 0 to a 1-bit integer stored in an array.
* \param array to store 1-bit integer values
* \param i is the index in array to set the bit to 0
*
* UNSETBIT(array, i) is a macro that sets 0 to a 1-bit integer stored in an array.
* It is implemented with XOR, so the bit must currently be 1; applied to a bit
* that is already 0, it sets the bit instead.
*/
#define UNSETBIT(array, i) (array[i >> 3] ^= ((bitmask[i & 0x00000007])))
//#define GETBIT(array, i) (array[(i) / 8] & bitmask[(i) % 8])
//#define SETBIT(array, i) (array[(i) / 8] |= bitmask[(i) % 8])
//#define UNSETBIT(array, i) (array[(i) / 8] ^= ((bitmask[(i) % 8])))
/** \def SETVALUE1(array, i, v)
* \brief set a value for a 2-bit integer stored in an array initialized with 1s.
* \param array to store 2-bit integer values
* \param i is the index in array to set the value v
* \param v is the value to be set
*
* SETVALUE1(array, i, v) is a macro that sets a value for a 2-bit integer stored in an array.
* The array should be initialized with all bits set to 1. For example:
* memset(array, 0xff, arraySize);
*/
#define SETVALUE1(array, i, v) (array[i >> 2] &= (cmph_uint8)((v << ((i & 0x00000003) << 1)) | valuemask[i & 0x00000003]))
/** \def SETVALUE0(array, i, v)
* \brief set a value for a 2-bit integer stored in an array initialized with 0s.
* \param array to store 2-bit integer values
* \param i is the index in array to set the value v
* \param v is the value to be set
*
* SETVALUE0(array, i, v) is a macro that sets a value for a 2-bit integer stored in an array.
* The array should be initialized with all bits set to 0. For example:
* memset(array, 0, arraySize);
*/
#define SETVALUE0(array, i, v) (array[i >> 2] |= (cmph_uint8)(v << ((i & 0x00000003) << 1)))
/** \def GETVALUE(array, i)
* \brief get a value for a 2-bit integer stored in an array.
* \param array to get 2-bit integer values from
* \param i is the index in array to get the value from
*
* GETVALUE(array, i) is a macro that gets a value for a 2-bit integer stored in an array.
*/
#define GETVALUE(array, i) ((cmph_uint8)((array[i >> 2] >> ((i & 0x00000003U) << 1U)) & 0x00000003U))
/** \def SETBIT32(array, i)
* \brief set 1 to a 1-bit integer stored in an array of 32-bit words.
* \param array to store 1-bit integer values. The entries are 32-bit words.
* \param i is the index in array to set the bit to 1
*
* SETBIT32(array, i) is a macro that sets 1 to a 1-bit integer stored in an array of 32-bit words.
*/
#define SETBIT32(array, i) (array[i >> 5] |= bitmask32[i & 0x0000001f])
/** \def GETBIT32(array, i)
* \brief get the value of a 1-bit integer stored in an array of 32-bit words.
* \param array to get 1-bit integer values from. The entries are 32-bit words.
* \param i is the index in array to get the 1-bit integer value from
*
* GETBIT32(array, i) is a macro that gets the value of a 1-bit integer stored in an array of 32-bit words.
*/
#define GETBIT32(array, i) (array[i >> 5] & bitmask32[i & 0x0000001f])
/** \def UNSETBIT32(array, i)
* \brief set 0 to a 1-bit integer stored in an array of 32-bit words.
* \param array to store 1-bit integer values. The entries are 32-bit words
* \param i is the index in array to set the bit to 0
*
* UNSETBIT32(array, i) is a macro that sets 0 to a 1-bit integer stored in an array of 32-bit words.
* It is implemented with XOR, so the bit must currently be 1.
*/
#define UNSETBIT32(array, i) (array[i >> 5] ^= ((bitmask32[i & 0x0000001f])))
#define BITS_TABLE_SIZE(n, bits_length) ((n * bits_length + 31) >> 5)
static inline void set_bits_value(cmph_uint32 * bits_table, cmph_uint32 index, cmph_uint32 bits_string,
cmph_uint32 string_length, cmph_uint32 string_mask)
{
register cmph_uint32 bit_idx = index * string_length;
register cmph_uint32 word_idx = bit_idx >> 5;
register cmph_uint32 shift1 = bit_idx & 0x0000001f;
register cmph_uint32 shift2 = 32 - shift1;
bits_table[word_idx] &= ~((string_mask) << shift1);
bits_table[word_idx] |= bits_string << shift1;
if(shift2 < string_length)
{
bits_table[word_idx+1] &= ~((string_mask) >> shift2);
bits_table[word_idx+1] |= bits_string >> shift2;
}
}
static inline cmph_uint32 get_bits_value(cmph_uint32 * bits_table,cmph_uint32 index, cmph_uint32 string_length, cmph_uint32 string_mask)
{
register cmph_uint32 bit_idx = index * string_length;
register cmph_uint32 word_idx = bit_idx >> 5;
register cmph_uint32 shift1 = bit_idx & 0x0000001f;
register cmph_uint32 shift2 = 32-shift1;
register cmph_uint32 bits_string;
bits_string = (bits_table[word_idx] >> shift1) & string_mask;
if(shift2 < string_length)
bits_string |= (bits_table[word_idx+1] << shift2) & string_mask;
return bits_string;
}
static inline void set_bits_at_pos(cmph_uint32 * bits_table, cmph_uint32 pos, cmph_uint32 bits_string, cmph_uint32 string_length)
{
register cmph_uint32 word_idx = pos >> 5;
register cmph_uint32 shift1 = pos & 0x0000001f;
register cmph_uint32 shift2 = 32-shift1;
register cmph_uint32 string_mask = (1U << string_length) - 1;
bits_table[word_idx] &= ~((string_mask) << shift1);
bits_table[word_idx] |= bits_string << shift1;
if(shift2 < string_length)
{
bits_table[word_idx+1] &= ~((string_mask) >> shift2);
bits_table[word_idx+1] |= bits_string >> shift2;
}
}
static inline cmph_uint32 get_bits_at_pos(cmph_uint32 * bits_table,cmph_uint32 pos,cmph_uint32 string_length)
{
register cmph_uint32 word_idx = pos >> 5;
register cmph_uint32 shift1 = pos & 0x0000001f;
register cmph_uint32 shift2 = 32 - shift1;
register cmph_uint32 string_mask = (1U << string_length) - 1;
register cmph_uint32 bits_string;
bits_string = (bits_table[word_idx] >> shift1) & string_mask;
if(shift2 < string_length)
bits_string |= (bits_table[word_idx+1] << shift2) & string_mask;
return bits_string;
}
#endif
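
The `set_bits_value()`/`get_bits_value()` pair above stores fixed-width fields back to back in an array of 32-bit words, so a field may straddle two words. Here is a minimal standalone sketch of the same idea using `<stdint.h>` types. The names `words_needed`, `field_set` and `field_get` are hypothetical, and the sketch restricts the field width to 1..31 bits so that no shift reaches 32.

```c
#include <assert.h>
#include <stdint.h>

/* Words needed for n fields of `bits` bits each; same rounding as
 * BITS_TABLE_SIZE(n, bits_length). */
static uint32_t words_needed(uint32_t n, uint32_t bits)
{
    return (n * bits + 31) >> 5;
}

/* Write the low `bits` bits of value into field `index`. */
static void field_set(uint32_t *table, uint32_t index, uint32_t bits, uint32_t value)
{
    uint32_t mask, pos, word, shift1, shift2;
    assert(bits >= 1 && bits < 32);
    mask = (1U << bits) - 1;
    pos = index * bits;       /* absolute bit position of the field */
    word = pos >> 5;          /* word holding the first bit */
    shift1 = pos & 31;        /* offset inside that word */
    shift2 = 32 - shift1;     /* bits left in that word */
    value &= mask;
    table[word] &= ~(mask << shift1);
    table[word] |= value << shift1;
    if (shift2 < bits) {      /* the field spills into the next word */
        table[word + 1] &= ~(mask >> shift2);
        table[word + 1] |= value >> shift2;
    }
}

/* Read field `index` back. */
static uint32_t field_get(const uint32_t *table, uint32_t index, uint32_t bits)
{
    uint32_t mask, pos, word, shift1, shift2, value;
    assert(bits >= 1 && bits < 32);
    mask = (1U << bits) - 1;
    pos = index * bits;
    word = pos >> 5;
    shift1 = pos & 31;
    shift2 = 32 - shift1;
    value = (table[word] >> shift1) & mask;
    if (shift2 < bits)
        value |= (table[word + 1] << shift2) & mask;
    return value;
}
```

With 7-bit fields, field 4 starts at bit 28 and field 9 at bit 63, so both exercise the two-word path.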

638
girepository/cmph/bmz.c Normal file
@ -0,0 +1,638 @@
#include "graph.h"
#include "bmz.h"
#include "cmph_structs.h"
#include "bmz_structs.h"
#include "hash.h"
#include "vqueue.h"
#include "bitbool.h"
#include <limits.h>
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
//#define DEBUG
#include "debug.h"
static int bmz_gen_edges(cmph_config_t *mph);
static cmph_uint8 bmz_traverse_critical_nodes(bmz_config_data_t *bmz, cmph_uint32 v, cmph_uint32 * biggest_g_value, cmph_uint32 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited);
static cmph_uint8 bmz_traverse_critical_nodes_heuristic(bmz_config_data_t *bmz, cmph_uint32 v, cmph_uint32 * biggest_g_value, cmph_uint32 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited);
static void bmz_traverse_non_critical_nodes(bmz_config_data_t *bmz, cmph_uint8 * used_edges, cmph_uint8 * visited);
bmz_config_data_t *bmz_config_new(void)
{
bmz_config_data_t *bmz = NULL;
bmz = (bmz_config_data_t *)malloc(sizeof(bmz_config_data_t));
assert(bmz);
memset(bmz, 0, sizeof(bmz_config_data_t));
bmz->hashfuncs[0] = CMPH_HASH_JENKINS;
bmz->hashfuncs[1] = CMPH_HASH_JENKINS;
bmz->g = NULL;
bmz->graph = NULL;
bmz->hashes = NULL;
return bmz;
}
void bmz_config_destroy(cmph_config_t *mph)
{
bmz_config_data_t *data = (bmz_config_data_t *)mph->data;
DEBUGP("Destroying algorithm dependent data\n");
free(data);
}
void bmz_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
bmz_config_data_t *bmz = (bmz_config_data_t *)mph->data;
CMPH_HASH *hashptr = hashfuncs;
cmph_uint32 i = 0;
while(*hashptr != CMPH_HASH_COUNT)
{
if (i >= 2) break; //bmz only uses two hash functions
bmz->hashfuncs[i] = *hashptr;
++i, ++hashptr;
}
}
cmph_t *bmz_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
bmz_data_t *bmzf = NULL;
cmph_uint32 i;
cmph_uint32 iterations;
cmph_uint32 iterations_map = 20;
cmph_uint8 *used_edges = NULL;
cmph_uint8 restart_mapping = 0;
cmph_uint8 * visited = NULL;
bmz_config_data_t *bmz = (bmz_config_data_t *)mph->data;
if (c == 0) c = 1.15; // validating restrictions over parameter c.
DEBUGP("c: %f\n", c);
bmz->m = mph->key_source->nkeys;
bmz->n = (cmph_uint32)ceil(c * mph->key_source->nkeys);
DEBUGP("m (edges): %u n (vertices): %u c: %f\n", bmz->m, bmz->n, c);
bmz->graph = graph_new(bmz->n, bmz->m);
DEBUGP("Created graph\n");
bmz->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*3);
for(i = 0; i < 3; ++i) bmz->hashes[i] = NULL;
do
{
// Mapping step
cmph_uint32 biggest_g_value = 0;
cmph_uint32 biggest_edge_value = 1;
iterations = 100;
if (mph->verbosity)
{
fprintf(stderr, "Entering mapping step for mph creation of %u keys with graph sized %u\n", bmz->m, bmz->n);
}
while(1)
{
int ok;
DEBUGP("hash function 1\n");
bmz->hashes[0] = hash_state_new(bmz->hashfuncs[0], bmz->n);
DEBUGP("hash function 2\n");
bmz->hashes[1] = hash_state_new(bmz->hashfuncs[1], bmz->n);
DEBUGP("Generating edges\n");
ok = bmz_gen_edges(mph);
if (!ok)
{
--iterations;
hash_state_destroy(bmz->hashes[0]);
bmz->hashes[0] = NULL;
hash_state_destroy(bmz->hashes[1]);
bmz->hashes[1] = NULL;
DEBUGP("%u iterations remaining\n", iterations);
if (mph->verbosity)
{
fprintf(stderr, "simple graph creation failure - %u iterations remaining\n", iterations);
}
if (iterations == 0) break;
}
else break;
}
if (iterations == 0)
{
graph_destroy(bmz->graph);
return NULL;
}
// Ordering step
if (mph->verbosity)
{
fprintf(stderr, "Starting ordering step\n");
}
graph_obtain_critical_nodes(bmz->graph);
// Searching step
if (mph->verbosity)
{
fprintf(stderr, "Starting Searching step.\n");
fprintf(stderr, "\tTraversing critical vertices.\n");
}
DEBUGP("Searching step\n");
visited = (cmph_uint8 *)malloc((size_t)bmz->n/8 + 1);
memset(visited, 0, (size_t)bmz->n/8 + 1);
used_edges = (cmph_uint8 *)malloc((size_t)bmz->m/8 + 1);
memset(used_edges, 0, (size_t)bmz->m/8 + 1);
free(bmz->g);
bmz->g = (cmph_uint32 *)calloc((size_t)bmz->n, sizeof(cmph_uint32));
assert(bmz->g);
for (i = 0; i < bmz->n; ++i) // critical nodes
{
if (graph_node_is_critical(bmz->graph, i) && (!GETBIT(visited,i)))
{
if(c > 1.14) restart_mapping = bmz_traverse_critical_nodes(bmz, i, &biggest_g_value, &biggest_edge_value, used_edges, visited);
else restart_mapping = bmz_traverse_critical_nodes_heuristic(bmz, i, &biggest_g_value, &biggest_edge_value, used_edges, visited);
if(restart_mapping) break;
}
}
if(!restart_mapping)
{
if (mph->verbosity)
{
fprintf(stderr, "\tTraversing non critical vertices.\n");
}
bmz_traverse_non_critical_nodes(bmz, used_edges, visited); // non_critical_nodes
}
else
{
iterations_map--;
if (mph->verbosity) fprintf(stderr, "Restarting mapping step. %u iterations remaining.\n", iterations_map);
}
free(used_edges);
free(visited);
}while(restart_mapping && iterations_map > 0);
graph_destroy(bmz->graph);
bmz->graph = NULL;
if (iterations_map == 0)
{
return NULL;
}
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
bmzf = (bmz_data_t *)malloc(sizeof(bmz_data_t));
bmzf->g = bmz->g;
bmz->g = NULL; //transfer memory ownership
bmzf->hashes = bmz->hashes;
bmz->hashes = NULL; //transfer memory ownership
bmzf->n = bmz->n;
bmzf->m = bmz->m;
mphf->data = bmzf;
mphf->size = bmz->m;
DEBUGP("Successfully generated minimal perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated minimal perfect hash function\n");
}
return mphf;
}
static cmph_uint8 bmz_traverse_critical_nodes(bmz_config_data_t *bmz, cmph_uint32 v, cmph_uint32 * biggest_g_value, cmph_uint32 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited)
{
cmph_uint32 next_g;
cmph_uint32 u; /* Auxiliary vertex */
cmph_uint32 lav; /* lookahead vertex */
cmph_uint8 collision;
vqueue_t * q = vqueue_new((cmph_uint32)(graph_ncritical_nodes(bmz->graph)) + 1);
graph_iterator_t it, it1;
DEBUGP("Labelling critical vertices\n");
bmz->g[v] = (cmph_uint32)ceil ((double)(*biggest_edge_value)/2) - 1;
SETBIT(visited, v);
next_g = (cmph_uint32)floor((double)(*biggest_edge_value/2)); /* next_g is incremented in the do..while statement*/
vqueue_insert(q, v);
while(!vqueue_is_empty(q))
{
v = vqueue_remove(q);
it = graph_neighbors_it(bmz->graph, v);
while ((u = graph_next_neighbor(bmz->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz->graph, u) && (!GETBIT(visited,u)))
{
collision = 1;
while(collision) // lookahead to resolve collisions
{
next_g = *biggest_g_value + 1;
it1 = graph_neighbors_it(bmz->graph, u);
collision = 0;
while((lav = graph_next_neighbor(bmz->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz->graph, lav) && GETBIT(visited,lav))
{
if(next_g + bmz->g[lav] >= bmz->m)
{
vqueue_destroy(q);
return 1; // restart mapping step.
}
if (GETBIT(used_edges, (next_g + bmz->g[lav])))
{
collision = 1;
break;
}
}
}
if (next_g > *biggest_g_value) *biggest_g_value = next_g;
}
// Marking used edges...
it1 = graph_neighbors_it(bmz->graph, u);
while((lav = graph_next_neighbor(bmz->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz->graph, lav) && GETBIT(visited, lav))
{
SETBIT(used_edges,(next_g + bmz->g[lav]));
if(next_g + bmz->g[lav] > *biggest_edge_value) *biggest_edge_value = next_g + bmz->g[lav];
}
}
bmz->g[u] = next_g; // Labelling vertex u.
SETBIT(visited,u);
vqueue_insert(q, u);
}
}
}
vqueue_destroy(q);
return 0;
}
static cmph_uint8 bmz_traverse_critical_nodes_heuristic(bmz_config_data_t *bmz, cmph_uint32 v, cmph_uint32 * biggest_g_value, cmph_uint32 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited)
{
cmph_uint32 next_g;
cmph_uint32 u; /* Auxiliary vertex */
cmph_uint32 lav; /* lookahead vertex */
cmph_uint8 collision;
cmph_uint32 * unused_g_values = NULL;
cmph_uint32 unused_g_values_capacity = 0;
cmph_uint32 nunused_g_values = 0;
vqueue_t * q = vqueue_new((cmph_uint32)(0.5*graph_ncritical_nodes(bmz->graph))+1);
graph_iterator_t it, it1;
DEBUGP("Labelling critical vertices\n");
bmz->g[v] = (cmph_uint32)ceil ((double)(*biggest_edge_value)/2) - 1;
SETBIT(visited, v);
next_g = (cmph_uint32)floor((double)(*biggest_edge_value/2)); /* next_g is incremented in the do..while statement*/
vqueue_insert(q, v);
while(!vqueue_is_empty(q))
{
v = vqueue_remove(q);
it = graph_neighbors_it(bmz->graph, v);
while ((u = graph_next_neighbor(bmz->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz->graph, u) && (!GETBIT(visited,u)))
{
cmph_uint32 next_g_index = 0;
collision = 1;
while(collision) // lookahead to resolve collisions
{
if (next_g_index < nunused_g_values)
{
next_g = unused_g_values[next_g_index++];
}
else
{
next_g = *biggest_g_value + 1;
next_g_index = UINT_MAX;
}
it1 = graph_neighbors_it(bmz->graph, u);
collision = 0;
while((lav = graph_next_neighbor(bmz->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz->graph, lav) && GETBIT(visited,lav))
{
if(next_g + bmz->g[lav] >= bmz->m)
{
vqueue_destroy(q);
free(unused_g_values);
return 1; // restart mapping step.
}
if (GETBIT(used_edges, (next_g + bmz->g[lav])))
{
collision = 1;
break;
}
}
}
if(collision && (next_g > *biggest_g_value)) // saving the current g value stored in next_g.
{
if(nunused_g_values == unused_g_values_capacity)
{
unused_g_values = (cmph_uint32 *)realloc(unused_g_values, (unused_g_values_capacity + BUFSIZ)*sizeof(cmph_uint32));
unused_g_values_capacity += BUFSIZ;
}
unused_g_values[nunused_g_values++] = next_g;
}
if (next_g > *biggest_g_value) *biggest_g_value = next_g;
}
next_g_index--;
if (next_g_index < nunused_g_values) unused_g_values[next_g_index] = unused_g_values[--nunused_g_values];
// Marking used edges...
it1 = graph_neighbors_it(bmz->graph, u);
while((lav = graph_next_neighbor(bmz->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz->graph, lav) && GETBIT(visited, lav))
{
SETBIT(used_edges,(next_g + bmz->g[lav]));
if(next_g + bmz->g[lav] > *biggest_edge_value) *biggest_edge_value = next_g + bmz->g[lav];
}
}
bmz->g[u] = next_g; // Labelling vertex u.
SETBIT(visited, u);
vqueue_insert(q, u);
}
}
}
vqueue_destroy(q);
free(unused_g_values);
return 0;
}
static cmph_uint32 next_unused_edge(bmz_config_data_t *bmz, cmph_uint8 * used_edges, cmph_uint32 unused_edge_index)
{
while(1)
{
assert(unused_edge_index < bmz->m);
if(GETBIT(used_edges, unused_edge_index)) unused_edge_index ++;
else break;
}
return unused_edge_index;
}
static void bmz_traverse(bmz_config_data_t *bmz, cmph_uint8 * used_edges, cmph_uint32 v, cmph_uint32 * unused_edge_index, cmph_uint8 * visited)
{
graph_iterator_t it = graph_neighbors_it(bmz->graph, v);
cmph_uint32 neighbor = 0;
while((neighbor = graph_next_neighbor(bmz->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
if(GETBIT(visited,neighbor)) continue;
//DEBUGP("Visiting neighbor %u\n", neighbor);
*unused_edge_index = next_unused_edge(bmz, used_edges, *unused_edge_index);
bmz->g[neighbor] = *unused_edge_index - bmz->g[v];
//if (bmz->g[neighbor] >= bmz->m) bmz->g[neighbor] += bmz->m;
SETBIT(visited, neighbor);
(*unused_edge_index)++;
bmz_traverse(bmz, used_edges, neighbor, unused_edge_index, visited);
}
}
static void bmz_traverse_non_critical_nodes(bmz_config_data_t *bmz, cmph_uint8 * used_edges, cmph_uint8 * visited)
{
cmph_uint32 i, v1, v2, unused_edge_index = 0;
DEBUGP("Labelling non critical vertices\n");
for(i = 0; i < bmz->m; i++)
{
v1 = graph_vertex_id(bmz->graph, i, 0);
v2 = graph_vertex_id(bmz->graph, i, 1);
if((GETBIT(visited,v1) && GETBIT(visited,v2)) || (!GETBIT(visited,v1) && !GETBIT(visited,v2))) continue;
if(GETBIT(visited,v1)) bmz_traverse(bmz, used_edges, v1, &unused_edge_index, visited);
else bmz_traverse(bmz, used_edges, v2, &unused_edge_index, visited);
}
for(i = 0; i < bmz->n; i++)
{
if(!GETBIT(visited,i))
{
bmz->g[i] = 0;
SETBIT(visited, i);
bmz_traverse(bmz, used_edges, i, &unused_edge_index, visited);
}
}
}
static int bmz_gen_edges(cmph_config_t *mph)
{
cmph_uint32 e;
bmz_config_data_t *bmz = (bmz_config_data_t *)mph->data;
cmph_uint8 multiple_edges = 0;
DEBUGP("Generating edges for %u vertices\n", bmz->n);
graph_clear_edges(bmz->graph);
mph->key_source->rewind(mph->key_source->data);
for (e = 0; e < mph->key_source->nkeys; ++e)
{
cmph_uint32 h1, h2;
cmph_uint32 keylen;
char *key = NULL;
mph->key_source->read(mph->key_source->data, &key, &keylen);
// if (key == NULL)fprintf(stderr, "key = %s -- read BMZ\n", key);
h1 = hash(bmz->hashes[0], key, keylen) % bmz->n;
h2 = hash(bmz->hashes[1], key, keylen) % bmz->n;
if (h1 == h2) if (++h2 >= bmz->n) h2 = 0;
if (h1 == h2)
{
if (mph->verbosity) fprintf(stderr, "Self loop for key %u\n", e);
mph->key_source->dispose(mph->key_source->data, key, keylen);
return 0;
}
//DEBUGP("Adding edge: %u -> %u for key %s\n", h1, h2, key);
mph->key_source->dispose(mph->key_source->data, key, keylen);
// fprintf(stderr, "key = %s -- dispose BMZ\n", key);
multiple_edges = graph_contains_edge(bmz->graph, h1, h2);
if (mph->verbosity && multiple_edges) fprintf(stderr, "A non simple graph was generated\n");
if (multiple_edges) return 0; // checking multiple edge restriction.
graph_add_edge(bmz->graph, h1, h2);
}
return !multiple_edges;
}
int bmz_dump(cmph_t *mphf, FILE *fd)
{
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 two = 2; //number of hash functions
bmz_data_t *data = (bmz_data_t *)mphf->data;
register size_t nbytes;
#ifdef DEBUG
cmph_uint32 i;
#endif
__cmph_dump(mphf, fd);
nbytes = fwrite(&two, sizeof(cmph_uint32), (size_t)1, fd);
hash_state_dump(data->hashes[0], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
hash_state_dump(data->hashes[1], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
nbytes = fwrite(&(data->n), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->m), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(data->g, sizeof(cmph_uint32)*(data->n), (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return 0;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < data->n; ++i) fprintf(stderr, "%u ", data->g[i]);
fprintf(stderr, "\n");
#endif
return 1;
}
void bmz_load(FILE *f, cmph_t *mphf)
{
cmph_uint32 nhashes;
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 i;
bmz_data_t *bmz = (bmz_data_t *)malloc(sizeof(bmz_data_t));
register size_t nbytes;
DEBUGP("Loading bmz mphf\n");
mphf->data = bmz;
nbytes = fread(&nhashes, sizeof(cmph_uint32), (size_t)1, f);
bmz->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*(nhashes + 1));
bmz->hashes[nhashes] = NULL;
DEBUGP("Reading %u hashes\n", nhashes);
for (i = 0; i < nhashes; ++i)
{
hash_state_t *state = NULL;
nbytes = fread(&buflen, sizeof(cmph_uint32), (size_t)1, f);
DEBUGP("Hash state has %u bytes\n", buflen);
buf = (char *)malloc((size_t)buflen);
nbytes = fread(buf, (size_t)buflen, (size_t)1, f);
state = hash_state_load(buf, buflen);
bmz->hashes[i] = state;
free(buf);
}
DEBUGP("Reading m and n\n");
nbytes = fread(&(bmz->n), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(bmz->m), sizeof(cmph_uint32), (size_t)1, f);
bmz->g = (cmph_uint32 *)malloc(sizeof(cmph_uint32)*bmz->n);
nbytes = fread(bmz->g, bmz->n*sizeof(cmph_uint32), (size_t)1, f);
if (nbytes == 0 && ferror(f)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < bmz->n; ++i) fprintf(stderr, "%u ", bmz->g[i]);
fprintf(stderr, "\n");
#endif
return;
}
cmph_uint32 bmz_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
bmz_data_t *bmz = mphf->data;
cmph_uint32 h1 = hash(bmz->hashes[0], key, keylen) % bmz->n;
cmph_uint32 h2 = hash(bmz->hashes[1], key, keylen) % bmz->n;
DEBUGP("key: %s h1: %u h2: %u\n", key, h1, h2);
if (h1 == h2 && ++h2 >= bmz->n) h2 = 0;
DEBUGP("key: %s g[h1]: %u g[h2]: %u edges: %u\n", key, bmz->g[h1], bmz->g[h2], bmz->m);
return bmz->g[h1] + bmz->g[h2];
}
void bmz_destroy(cmph_t *mphf)
{
bmz_data_t *data = (bmz_data_t *)mphf->data;
free(data->g);
hash_state_destroy(data->hashes[0]);
hash_state_destroy(data->hashes[1]);
free(data->hashes);
free(data);
free(mphf);
}
/** \fn void bmz_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bmz_pack(cmph_t *mphf, void *packed_mphf)
{
bmz_data_t *data = (bmz_data_t *)mphf->data;
cmph_uint8 * ptr = packed_mphf;
CMPH_HASH h2_type;
// packing h1 type
CMPH_HASH h1_type = hash_get_type(data->hashes[0]);
*((cmph_uint32 *) ptr) = h1_type;
ptr += sizeof(cmph_uint32);
// packing h1
hash_state_pack(data->hashes[0], ptr);
ptr += hash_state_packed_size(h1_type);
// packing h2 type
h2_type = hash_get_type(data->hashes[1]);
*((cmph_uint32 *) ptr) = h2_type;
ptr += sizeof(cmph_uint32);
// packing h2
hash_state_pack(data->hashes[1], ptr);
ptr += hash_state_packed_size(h2_type);
// packing n
*((cmph_uint32 *) ptr) = data->n;
ptr += sizeof(data->n);
// packing g
memcpy(ptr, data->g, sizeof(cmph_uint32)*data->n);
}
/** \fn cmph_uint32 bmz_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bmz_packed_size(cmph_t *mphf)
{
bmz_data_t *data = (bmz_data_t *)mphf->data;
CMPH_HASH h1_type = hash_get_type(data->hashes[0]);
CMPH_HASH h2_type = hash_get_type(data->hashes[1]);
return (cmph_uint32)(sizeof(CMPH_ALGO) + hash_state_packed_size(h1_type) + hash_state_packed_size(h2_type) +
3*sizeof(cmph_uint32) + sizeof(cmph_uint32)*data->n);
}
/** \fn cmph_uint32 bmz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 bmz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
register cmph_uint8 *h1_ptr = packed_mphf;
register CMPH_HASH h1_type = *((cmph_uint32 *)h1_ptr);
register cmph_uint8 *h2_ptr;
register CMPH_HASH h2_type;
register cmph_uint32 *g_ptr, n, h1, h2;
h1_ptr += 4;
h2_ptr = h1_ptr + hash_state_packed_size(h1_type);
h2_type = *((cmph_uint32 *)h2_ptr);
h2_ptr += 4;
g_ptr = (cmph_uint32 *)(h2_ptr + hash_state_packed_size(h2_type));
n = *g_ptr++;
h1 = hash_packed(h1_ptr, h1_type, key, keylen) % n;
h2 = hash_packed(h2_ptr, h2_type, key, keylen) % n;
if (h1 == h2 && ++h2 >= n) h2 = 0;
return (g_ptr[h1] + g_ptr[h2]);
}
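
A BMZ lookup treats each key as an edge (h1, h2) of a graph on n vertices and returns `g[h1] + g[h2]`. The construction above chooses `g` so that these sums are distinct values in `0..m-1`. Below is a minimal sketch on a hand-built instance. `toy_h1`, `toy_h2` and `toy_bmz_search` are hypothetical stand-ins for the seeded hash states and are not cmph API. The self-loop tie-break wraps with `>=`, as `bmz_gen_edges()` does.

```c
#include <assert.h>
#include <stdint.h>

/* Toy hash functions (purely hypothetical): first and second letter of a
 * lowercase key, counted from 'a'. */
static uint32_t toy_h1(const char *key) { return (uint32_t)(key[0] - 'a'); }
static uint32_t toy_h2(const char *key) { return (uint32_t)(key[1] - 'a'); }

/* Look up a key: reduce both hashes to vertices, avoid a self-loop, and
 * add the two vertex labels. */
static uint32_t toy_bmz_search(const uint32_t *g, uint32_t n, const char *key)
{
    uint32_t h1, h2;
    assert(key[0] != '\0' && key[1] != '\0');
    h1 = toy_h1(key) % n;
    h2 = toy_h2(key) % n;
    if (h1 == h2 && ++h2 >= n) h2 = 0; /* wrap so h2 stays inside 0..n-1 */
    return g[h1] + g[h2];
}
```

For the keys "ab", "bc" and "cd" with n = 4, the edges are (0,1), (1,2) and (2,3). The labels g = {0, 0, 1, 1} give the sums 0, 1 and 2, a minimal perfect hash over the three keys.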

42
girepository/cmph/bmz.h Normal file
@ -0,0 +1,42 @@
#ifndef __CMPH_BMZ_H__
#define __CMPH_BMZ_H__
#include "cmph.h"
typedef struct __bmz_data_t bmz_data_t;
typedef struct __bmz_config_data_t bmz_config_data_t;
bmz_config_data_t *bmz_config_new(void);
void bmz_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void bmz_config_destroy(cmph_config_t *mph);
cmph_t *bmz_new(cmph_config_t *mph, double c);
void bmz_load(FILE *f, cmph_t *mphf);
int bmz_dump(cmph_t *mphf, FILE *f);
void bmz_destroy(cmph_t *mphf);
cmph_uint32 bmz_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void bmz_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bmz_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 bmz_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bmz_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 bmz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 bmz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif
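
The pack and search_packed functions declared above serialise the function into one contiguous buffer: a hash type, that hash's packed state, the same for the second hash, then `n` and the `g` array. The sketch below shows that layout in simplified form. It assumes every hash state packs to a single 32-bit seed, which is not true of the real `hash_state_pack()`. The names `toy_mphf`, `toy_packed_size`, `toy_pack` and `toy_packed_g` are hypothetical. It uses `memcpy` rather than pointer casts, so the buffer needs no particular alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for bmz_data_t. */
typedef struct {
    uint32_t type1, seed1;   /* first hash: type tag and packed state */
    uint32_t type2, seed2;   /* second hash */
    uint32_t n;              /* number of vertices, length of g */
    const uint32_t *g;       /* vertex labels */
} toy_mphf;

/* Bytes needed: five header words plus n label words. */
static size_t toy_packed_size(const toy_mphf *f)
{
    return 5 * sizeof(uint32_t) + (size_t)f->n * sizeof(uint32_t);
}

static void put_u32(uint8_t **p, uint32_t v)
{
    memcpy(*p, &v, sizeof v);
    *p += sizeof v;
}

static uint32_t get_u32(const uint8_t **p)
{
    uint32_t v;
    memcpy(&v, *p, sizeof v);
    *p += sizeof v;
    return v;
}

/* Serialise f into buf, which must hold toy_packed_size(f) bytes. */
static void toy_pack(const toy_mphf *f, void *buf)
{
    uint8_t *p = buf;
    put_u32(&p, f->type1);
    put_u32(&p, f->seed1);
    put_u32(&p, f->type2);
    put_u32(&p, f->seed2);
    put_u32(&p, f->n);
    memcpy(p, f->g, (size_t)f->n * sizeof(uint32_t));
}

/* Read g[i] straight from the packed buffer, without unpacking. */
static uint32_t toy_packed_g(const void *buf, uint32_t i)
{
    const uint8_t *p = buf;
    uint32_t n;
    p += 4 * sizeof(uint32_t);          /* skip both (type, seed) pairs */
    n = get_u32(&p);
    assert(i < n);
    p += (size_t)i * sizeof(uint32_t);
    return get_u32(&p);
}
```

Because a lookup only needs the buffer, a packed function can be stored in a file or a memory-mapped region and queried in place.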

647
girepository/cmph/bmz8.c Normal file
@ -0,0 +1,647 @@
#include "graph.h"
#include "bmz8.h"
#include "cmph_structs.h"
#include "bmz8_structs.h"
#include "hash.h"
#include "vqueue.h"
#include "bitbool.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
//#define DEBUG
#include "debug.h"
static int bmz8_gen_edges(cmph_config_t *mph);
static cmph_uint8 bmz8_traverse_critical_nodes(bmz8_config_data_t *bmz8, cmph_uint32 v, cmph_uint8 * biggest_g_value, cmph_uint8 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited);
static cmph_uint8 bmz8_traverse_critical_nodes_heuristic(bmz8_config_data_t *bmz8, cmph_uint32 v, cmph_uint8 * biggest_g_value, cmph_uint8 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited);
static void bmz8_traverse_non_critical_nodes(bmz8_config_data_t *bmz8, cmph_uint8 * used_edges, cmph_uint8 * visited);
bmz8_config_data_t *bmz8_config_new(void)
{
bmz8_config_data_t *bmz8;
bmz8 = (bmz8_config_data_t *)malloc(sizeof(bmz8_config_data_t));
assert(bmz8);
memset(bmz8, 0, sizeof(bmz8_config_data_t));
bmz8->hashfuncs[0] = CMPH_HASH_JENKINS;
bmz8->hashfuncs[1] = CMPH_HASH_JENKINS;
bmz8->g = NULL;
bmz8->graph = NULL;
bmz8->hashes = NULL;
return bmz8;
}
void bmz8_config_destroy(cmph_config_t *mph)
{
bmz8_config_data_t *data = (bmz8_config_data_t *)mph->data;
DEBUGP("Destroying algorithm dependent data\n");
free(data);
}
void bmz8_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
bmz8_config_data_t *bmz8 = (bmz8_config_data_t *)mph->data;
CMPH_HASH *hashptr = hashfuncs;
cmph_uint8 i = 0;
while(*hashptr != CMPH_HASH_COUNT)
{
if (i >= 2) break; //bmz8 only uses two hash functions
bmz8->hashfuncs[i] = *hashptr;
++i, ++hashptr;
}
}
cmph_t *bmz8_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
bmz8_data_t *bmz8f = NULL;
cmph_uint8 i;
cmph_uint8 iterations;
cmph_uint8 iterations_map = 20;
cmph_uint8 *used_edges = NULL;
cmph_uint8 restart_mapping = 0;
cmph_uint8 * visited = NULL;
bmz8_config_data_t *bmz8 = (bmz8_config_data_t *)mph->data;
if (mph->key_source->nkeys >= 256)
{
if (mph->verbosity) fprintf(stderr, "The number of keys in BMZ8 must be lower than 256.\n");
return NULL;
}
if (c == 0) c = 1.15; // validating restrictions over parameter c.
DEBUGP("c: %f\n", c);
bmz8->m = (cmph_uint8) mph->key_source->nkeys;
bmz8->n = (cmph_uint8) ceil(c * mph->key_source->nkeys);
DEBUGP("m (edges): %u n (vertices): %u c: %f\n", bmz8->m, bmz8->n, c);
bmz8->graph = graph_new(bmz8->n, bmz8->m);
DEBUGP("Created graph\n");
bmz8->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*3);
for(i = 0; i < 3; ++i) bmz8->hashes[i] = NULL;
do
{
// Mapping step
cmph_uint8 biggest_g_value = 0;
cmph_uint8 biggest_edge_value = 1;
iterations = 100;
if (mph->verbosity)
{
fprintf(stderr, "Entering mapping step for mph creation of %u keys with graph sized %u\n", bmz8->m, bmz8->n);
}
while(1)
{
int ok;
DEBUGP("hash function 1\n");
bmz8->hashes[0] = hash_state_new(bmz8->hashfuncs[0], bmz8->n);
DEBUGP("hash function 2\n");
bmz8->hashes[1] = hash_state_new(bmz8->hashfuncs[1], bmz8->n);
DEBUGP("Generating edges\n");
ok = bmz8_gen_edges(mph);
if (!ok)
{
--iterations;
hash_state_destroy(bmz8->hashes[0]);
bmz8->hashes[0] = NULL;
hash_state_destroy(bmz8->hashes[1]);
bmz8->hashes[1] = NULL;
DEBUGP("%u iterations remaining\n", iterations);
if (mph->verbosity)
{
fprintf(stderr, "simple graph creation failure - %u iterations remaining\n", iterations);
}
if (iterations == 0) break;
}
else break;
}
if (iterations == 0)
{
graph_destroy(bmz8->graph);
return NULL;
}
// Ordering step
if (mph->verbosity)
{
fprintf(stderr, "Starting ordering step\n");
}
graph_obtain_critical_nodes(bmz8->graph);
// Searching step
if (mph->verbosity)
{
fprintf(stderr, "Starting Searching step.\n");
fprintf(stderr, "\tTraversing critical vertices.\n");
}
DEBUGP("Searching step\n");
visited = (cmph_uint8 *)malloc((size_t)bmz8->n/8 + 1);
memset(visited, 0, (size_t)bmz8->n/8 + 1);
used_edges = (cmph_uint8 *)malloc((size_t)bmz8->m/8 + 1);
memset(used_edges, 0, (size_t)bmz8->m/8 + 1);
free(bmz8->g);
bmz8->g = (cmph_uint8 *)calloc((size_t)bmz8->n, sizeof(cmph_uint8));
assert(bmz8->g);
for (i = 0; i < bmz8->n; ++i) // critical nodes
{
if (graph_node_is_critical(bmz8->graph, i) && (!GETBIT(visited,i)))
{
if(c > 1.14) restart_mapping = bmz8_traverse_critical_nodes(bmz8, i, &biggest_g_value, &biggest_edge_value, used_edges, visited);
else restart_mapping = bmz8_traverse_critical_nodes_heuristic(bmz8, i, &biggest_g_value, &biggest_edge_value, used_edges, visited);
if(restart_mapping) break;
}
}
if(!restart_mapping)
{
if (mph->verbosity)
{
fprintf(stderr, "\tTraversing non critical vertices.\n");
}
bmz8_traverse_non_critical_nodes(bmz8, used_edges, visited); // non_critical_nodes
}
else
{
iterations_map--;
if (mph->verbosity) fprintf(stderr, "Restarting mapping step. %u iterations remaining.\n", iterations_map);
}
free(used_edges);
free(visited);
}while(restart_mapping && iterations_map > 0);
graph_destroy(bmz8->graph);
bmz8->graph = NULL;
if (iterations_map == 0)
{
return NULL;
}
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
bmz8f = (bmz8_data_t *)malloc(sizeof(bmz8_data_t));
bmz8f->g = bmz8->g;
bmz8->g = NULL; //transfer memory ownership
bmz8f->hashes = bmz8->hashes;
bmz8->hashes = NULL; //transfer memory ownership
bmz8f->n = bmz8->n;
bmz8f->m = bmz8->m;
mphf->data = bmz8f;
mphf->size = bmz8->m;
DEBUGP("Successfully generated minimal perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated minimal perfect hash function\n");
}
return mphf;
}
static cmph_uint8 bmz8_traverse_critical_nodes(bmz8_config_data_t *bmz8, cmph_uint32 v, cmph_uint8 * biggest_g_value, cmph_uint8 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited)
{
cmph_uint8 next_g;
cmph_uint32 u; /* Auxiliary vertex */
cmph_uint32 lav; /* lookahead vertex */
cmph_uint8 collision;
vqueue_t * q = vqueue_new((cmph_uint32)(graph_ncritical_nodes(bmz8->graph)));
graph_iterator_t it, it1;
DEBUGP("Labelling critical vertices\n");
bmz8->g[v] = (cmph_uint8)(ceil ((double)(*biggest_edge_value)/2) - 1);
SETBIT(visited, v);
next_g = (cmph_uint8)floor((double)(*biggest_edge_value/2)); /* next_g is incremented in the do..while statement*/
vqueue_insert(q, v);
while(!vqueue_is_empty(q))
{
v = vqueue_remove(q);
it = graph_neighbors_it(bmz8->graph, v);
while ((u = graph_next_neighbor(bmz8->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz8->graph, u) && (!GETBIT(visited,u)))
{
collision = 1;
while(collision) // lookahead to resolve collisions
{
next_g = (cmph_uint8)(*biggest_g_value + 1);
it1 = graph_neighbors_it(bmz8->graph, u);
collision = 0;
while((lav = graph_next_neighbor(bmz8->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz8->graph, lav) && GETBIT(visited,lav))
{
if(next_g + bmz8->g[lav] >= bmz8->m)
{
vqueue_destroy(q);
return 1; // restart mapping step.
}
if (GETBIT(used_edges, (next_g + bmz8->g[lav])))
{
collision = 1;
break;
}
}
}
if (next_g > *biggest_g_value) *biggest_g_value = next_g;
}
// Marking used edges...
it1 = graph_neighbors_it(bmz8->graph, u);
while((lav = graph_next_neighbor(bmz8->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz8->graph, lav) && GETBIT(visited, lav))
{
SETBIT(used_edges,(next_g + bmz8->g[lav]));
if(next_g + bmz8->g[lav] > *biggest_edge_value)
*biggest_edge_value = (cmph_uint8)(next_g + bmz8->g[lav]);
}
}
bmz8->g[u] = next_g; // Labelling vertex u.
SETBIT(visited,u);
vqueue_insert(q, u);
}
}
}
vqueue_destroy(q);
return 0;
}
static cmph_uint8 bmz8_traverse_critical_nodes_heuristic(bmz8_config_data_t *bmz8, cmph_uint32 v, cmph_uint8 * biggest_g_value, cmph_uint8 * biggest_edge_value, cmph_uint8 * used_edges, cmph_uint8 * visited)
{
cmph_uint8 next_g;
cmph_uint32 u;
cmph_uint32 lav;
cmph_uint8 collision;
cmph_uint8 * unused_g_values = NULL;
cmph_uint8 unused_g_values_capacity = 0;
cmph_uint8 nunused_g_values = 0;
vqueue_t * q = vqueue_new((cmph_uint32)(graph_ncritical_nodes(bmz8->graph)));
graph_iterator_t it, it1;
DEBUGP("Labelling critical vertices\n");
bmz8->g[v] = (cmph_uint8)(ceil ((double)(*biggest_edge_value)/2) - 1);
SETBIT(visited, v);
next_g = (cmph_uint8)floor((double)(*biggest_edge_value/2));
vqueue_insert(q, v);
while(!vqueue_is_empty(q))
{
v = vqueue_remove(q);
it = graph_neighbors_it(bmz8->graph, v);
while ((u = graph_next_neighbor(bmz8->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz8->graph, u) && (!GETBIT(visited,u)))
{
cmph_uint8 next_g_index = 0;
collision = 1;
while(collision) // lookahead to resolve collisions
{
if (next_g_index < nunused_g_values)
{
next_g = unused_g_values[next_g_index++];
}
else
{
next_g = (cmph_uint8)(*biggest_g_value + 1);
next_g_index = 255;//UINT_MAX;
}
it1 = graph_neighbors_it(bmz8->graph, u);
collision = 0;
while((lav = graph_next_neighbor(bmz8->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz8->graph, lav) && GETBIT(visited,lav))
{
if(next_g + bmz8->g[lav] >= bmz8->m)
{
vqueue_destroy(q);
free(unused_g_values);
return 1; // restart mapping step.
}
if (GETBIT(used_edges, (next_g + bmz8->g[lav])))
{
collision = 1;
break;
}
}
}
if(collision && (next_g > *biggest_g_value)) // saving the current g value stored in next_g.
{
if(nunused_g_values == unused_g_values_capacity)
{
unused_g_values = (cmph_uint8*)realloc(unused_g_values, ((size_t)(unused_g_values_capacity + BUFSIZ))*sizeof(cmph_uint8));
unused_g_values_capacity += (cmph_uint8)BUFSIZ;
}
unused_g_values[nunused_g_values++] = next_g;
}
if (next_g > *biggest_g_value) *biggest_g_value = next_g;
}
next_g_index--;
if (next_g_index < nunused_g_values) unused_g_values[next_g_index] = unused_g_values[--nunused_g_values];
// Marking used edges...
it1 = graph_neighbors_it(bmz8->graph, u);
while((lav = graph_next_neighbor(bmz8->graph, &it1)) != GRAPH_NO_NEIGHBOR)
{
if (graph_node_is_critical(bmz8->graph, lav) && GETBIT(visited, lav))
{
SETBIT(used_edges,(next_g + bmz8->g[lav]));
if(next_g + bmz8->g[lav] > *biggest_edge_value)
*biggest_edge_value = (cmph_uint8)(next_g + bmz8->g[lav]);
}
}
bmz8->g[u] = next_g; // Labelling vertex u.
SETBIT(visited, u);
vqueue_insert(q, u);
}
}
}
vqueue_destroy(q);
free(unused_g_values);
return 0;
}
static cmph_uint8 next_unused_edge(bmz8_config_data_t *bmz8, cmph_uint8 * used_edges, cmph_uint32 unused_edge_index)
{
while(1)
{
assert(unused_edge_index < bmz8->m);
if(GETBIT(used_edges, unused_edge_index)) unused_edge_index ++;
else break;
}
return (cmph_uint8)unused_edge_index;
}
static void bmz8_traverse(bmz8_config_data_t *bmz8, cmph_uint8 * used_edges, cmph_uint32 v, cmph_uint8 * unused_edge_index, cmph_uint8 * visited)
{
graph_iterator_t it = graph_neighbors_it(bmz8->graph, v);
cmph_uint32 neighbor = 0;
while((neighbor = graph_next_neighbor(bmz8->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
if(GETBIT(visited,neighbor)) continue;
//DEBUGP("Visiting neighbor %u\n", neighbor);
*unused_edge_index = next_unused_edge(bmz8, used_edges, *unused_edge_index);
bmz8->g[neighbor] = (cmph_uint8)(*unused_edge_index - bmz8->g[v]);
//if (bmz8->g[neighbor] >= bmz8->m) bmz8->g[neighbor] += bmz8->m;
SETBIT(visited, neighbor);
(*unused_edge_index)++;
bmz8_traverse(bmz8, used_edges, neighbor, unused_edge_index, visited);
}
}
static void bmz8_traverse_non_critical_nodes(bmz8_config_data_t *bmz8, cmph_uint8 * used_edges, cmph_uint8 * visited)
{
cmph_uint8 i, v1, v2, unused_edge_index = 0;
DEBUGP("Labelling non critical vertices\n");
for(i = 0; i < bmz8->m; i++)
{
v1 = (cmph_uint8)graph_vertex_id(bmz8->graph, i, 0);
v2 = (cmph_uint8)graph_vertex_id(bmz8->graph, i, 1);
if((GETBIT(visited,v1) && GETBIT(visited,v2)) || (!GETBIT(visited,v1) && !GETBIT(visited,v2))) continue;
if(GETBIT(visited,v1)) bmz8_traverse(bmz8, used_edges, v1, &unused_edge_index, visited);
else bmz8_traverse(bmz8, used_edges, v2, &unused_edge_index, visited);
}
for(i = 0; i < bmz8->n; i++)
{
if(!GETBIT(visited,i))
{
bmz8->g[i] = 0;
SETBIT(visited, i);
bmz8_traverse(bmz8, used_edges, i, &unused_edge_index, visited);
}
}
}
static int bmz8_gen_edges(cmph_config_t *mph)
{
cmph_uint8 e;
bmz8_config_data_t *bmz8 = (bmz8_config_data_t *)mph->data;
cmph_uint8 multiple_edges = 0;
DEBUGP("Generating edges for %u vertices\n", bmz8->n);
graph_clear_edges(bmz8->graph);
mph->key_source->rewind(mph->key_source->data);
for (e = 0; e < mph->key_source->nkeys; ++e)
{
cmph_uint8 h1, h2;
cmph_uint32 keylen;
char *key = NULL;
mph->key_source->read(mph->key_source->data, &key, &keylen);
// if (key == NULL)fprintf(stderr, "key = %s -- read BMZ\n", key);
h1 = (cmph_uint8)(hash(bmz8->hashes[0], key, keylen) % bmz8->n);
h2 = (cmph_uint8)(hash(bmz8->hashes[1], key, keylen) % bmz8->n);
if (h1 == h2) if (++h2 >= bmz8->n) h2 = 0;
if (h1 == h2)
{
if (mph->verbosity) fprintf(stderr, "Self loop for key %u\n", e);
mph->key_source->dispose(mph->key_source->data, key, keylen);
return 0;
}
//DEBUGP("Adding edge: %u -> %u for key %s\n", h1, h2, key);
mph->key_source->dispose(mph->key_source->data, key, keylen);
// fprintf(stderr, "key = %s -- dispose BMZ\n", key);
multiple_edges = graph_contains_edge(bmz8->graph, h1, h2);
if (mph->verbosity && multiple_edges) fprintf(stderr, "A non simple graph was generated\n");
if (multiple_edges) return 0; // checking multiple edge restriction.
graph_add_edge(bmz8->graph, h1, h2);
}
return !multiple_edges;
}
int bmz8_dump(cmph_t *mphf, FILE *fd)
{
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint8 two = 2; //number of hash functions
bmz8_data_t *data = (bmz8_data_t *)mphf->data;
register size_t nbytes;
__cmph_dump(mphf, fd);
nbytes = fwrite(&two, sizeof(cmph_uint8), (size_t)1, fd);
hash_state_dump(data->hashes[0], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
hash_state_dump(data->hashes[1], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
nbytes = fwrite(&(data->n), sizeof(cmph_uint8), (size_t)1, fd);
nbytes = fwrite(&(data->m), sizeof(cmph_uint8), (size_t)1, fd);
nbytes = fwrite(data->g, sizeof(cmph_uint8)*(data->n), (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return 0;
}
/* #ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < data->n; ++i) fprintf(stderr, "%u ", data->g[i]);
fprintf(stderr, "\n");
#endif*/
return 1;
}
void bmz8_load(FILE *f, cmph_t *mphf)
{
cmph_uint8 nhashes;
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint8 i;
register size_t nbytes;
bmz8_data_t *bmz8 = (bmz8_data_t *)malloc(sizeof(bmz8_data_t));
DEBUGP("Loading bmz8 mphf\n");
mphf->data = bmz8;
nbytes = fread(&nhashes, sizeof(cmph_uint8), (size_t)1, f);
bmz8->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*(size_t)(nhashes + 1));
bmz8->hashes[nhashes] = NULL;
DEBUGP("Reading %u hashes\n", nhashes);
for (i = 0; i < nhashes; ++i)
{
hash_state_t *state = NULL;
nbytes = fread(&buflen, sizeof(cmph_uint32), (size_t)1, f);
DEBUGP("Hash state has %u bytes\n", buflen);
buf = (char *)malloc((size_t)buflen);
nbytes = fread(buf, (size_t)buflen, (size_t)1, f);
state = hash_state_load(buf, buflen);
bmz8->hashes[i] = state;
free(buf);
}
DEBUGP("Reading m and n\n");
nbytes = fread(&(bmz8->n), sizeof(cmph_uint8), (size_t)1, f);
nbytes = fread(&(bmz8->m), sizeof(cmph_uint8), (size_t)1, f);
bmz8->g = (cmph_uint8 *)malloc(sizeof(cmph_uint8)*bmz8->n);
nbytes = fread(bmz8->g, bmz8->n*sizeof(cmph_uint8), (size_t)1, f);
if (nbytes == 0 && ferror(f)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < bmz8->n; ++i) fprintf(stderr, "%u ", bmz8->g[i]);
fprintf(stderr, "\n");
#endif
return;
}
cmph_uint8 bmz8_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
bmz8_data_t *bmz8 = mphf->data;
cmph_uint8 h1 = (cmph_uint8)(hash(bmz8->hashes[0], key, keylen) % bmz8->n);
cmph_uint8 h2 = (cmph_uint8)(hash(bmz8->hashes[1], key, keylen) % bmz8->n);
DEBUGP("key: %s h1: %u h2: %u\n", key, h1, h2);
if (h1 == h2 && ++h2 >= bmz8->n) h2 = 0; // wrap exactly as bmz8_gen_edges() does, so g[h2] stays in bounds
DEBUGP("key: %s g[h1]: %u g[h2]: %u edges: %u\n", key, bmz8->g[h1], bmz8->g[h2], bmz8->m);
return (cmph_uint8)(bmz8->g[h1] + bmz8->g[h2]);
}
void bmz8_destroy(cmph_t *mphf)
{
bmz8_data_t *data = (bmz8_data_t *)mphf->data;
free(data->g);
hash_state_destroy(data->hashes[0]);
hash_state_destroy(data->hashes[1]);
free(data->hashes);
free(data);
free(mphf);
}
/** \fn void bmz8_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bmz8_pack(cmph_t *mphf, void *packed_mphf)
{
bmz8_data_t *data = (bmz8_data_t *)mphf->data;
cmph_uint8 * ptr = packed_mphf;
CMPH_HASH h2_type;
// packing h1 type
CMPH_HASH h1_type = hash_get_type(data->hashes[0]);
*((cmph_uint32 *) ptr) = h1_type;
ptr += sizeof(cmph_uint32);
// packing h1
hash_state_pack(data->hashes[0], ptr);
ptr += hash_state_packed_size(h1_type);
// packing h2 type
h2_type = hash_get_type(data->hashes[1]);
*((cmph_uint32 *) ptr) = h2_type;
ptr += sizeof(cmph_uint32);
// packing h2
hash_state_pack(data->hashes[1], ptr);
ptr += hash_state_packed_size(h2_type);
// packing n
*ptr++ = data->n;
// packing g
memcpy(ptr, data->g, sizeof(cmph_uint8)*data->n);
}
/** \fn cmph_uint32 bmz8_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bmz8_packed_size(cmph_t *mphf)
{
bmz8_data_t *data = (bmz8_data_t *)mphf->data;
CMPH_HASH h1_type = hash_get_type(data->hashes[0]);
CMPH_HASH h2_type = hash_get_type(data->hashes[1]);
return (cmph_uint32)(sizeof(CMPH_ALGO) + hash_state_packed_size(h1_type) + hash_state_packed_size(h2_type) +
2*sizeof(cmph_uint32) + sizeof(cmph_uint8) + sizeof(cmph_uint8)*data->n);
}
/** \fn cmph_uint8 bmz8_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint8 bmz8_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
register cmph_uint8 *h1_ptr = packed_mphf;
register CMPH_HASH h1_type = *((cmph_uint32 *)h1_ptr);
register cmph_uint8 *h2_ptr;
register CMPH_HASH h2_type;
register cmph_uint8 *g_ptr, n, h1, h2;
h1_ptr += 4;
h2_ptr = h1_ptr + hash_state_packed_size(h1_type);
h2_type = *((cmph_uint32 *)h2_ptr);
h2_ptr += 4;
g_ptr = h2_ptr + hash_state_packed_size(h2_type);
n = *g_ptr++;
h1 = (cmph_uint8)(hash_packed(h1_ptr, h1_type, key, keylen) % n);
h2 = (cmph_uint8)(hash_packed(h2_ptr, h2_type, key, keylen) % n);
DEBUGP("key: %s h1: %u h2: %u\n", key, h1, h2);
if (h1 == h2 && ++h2 >= n) h2 = 0; // wrap exactly as bmz8_gen_edges() does, so g_ptr[h2] stays in bounds
return (cmph_uint8)(g_ptr[h1] + g_ptr[h2]);
}
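
The packed form written by bmz8_pack() above is a flat byte layout: a 32-bit hash type and the packed hash state for each of the two hash functions, one byte for n, then the n bytes of g. The sketch below mirrors that layout and the g[h1] + g[h2] lookup under simplifying assumptions. The names toy_hash, toy_packed_size, toy_pack and toy_search_packed are hypothetical, the hash state is reduced to a single 32-bit seed, and the type tag is a placeholder. It is an illustration, not the cmph implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a cmph hash state: a 32-bit seed plus the byte sum. */
static uint32_t toy_hash(uint32_t seed, const char *key, uint32_t keylen)
{
    uint32_t h = seed;
    for (uint32_t i = 0; i < keylen; ++i) h += (unsigned char)key[i];
    return h;
}

/* Two (type, seed) pairs of 32-bit words, one byte for n, n bytes of g. */
static uint32_t toy_packed_size(uint8_t n)
{
    return (uint32_t)(4 * sizeof(uint32_t) + 1 + n);
}

static void toy_pack(uint8_t *out, uint32_t seed1, uint32_t seed2,
                     uint8_t n, const uint8_t *g)
{
    const uint32_t type = 0; /* placeholder for the hash type tag */
    memcpy(out, &type, 4);  out += 4;
    memcpy(out, &seed1, 4); out += 4;
    memcpy(out, &type, 4);  out += 4;
    memcpy(out, &seed2, 4); out += 4;
    *out++ = n;
    memcpy(out, g, n);
}

static uint8_t toy_search_packed(const uint8_t *packed, const char *key, uint32_t keylen)
{
    uint32_t seed1, seed2;
    memcpy(&seed1, packed + 4, 4);   /* skip the first type tag */
    memcpy(&seed2, packed + 12, 4);  /* skip the second type tag */
    uint8_t n = packed[16];
    const uint8_t *g = packed + 17;
    assert(n > 0);
    uint8_t h1 = (uint8_t)(toy_hash(seed1, key, keylen) % n);
    uint8_t h2 = (uint8_t)(toy_hash(seed2, key, keylen) % n);
    if (h1 == h2 && ++h2 >= n) h2 = 0; /* same tie-break as edge generation */
    return (uint8_t)(g[h1] + g[h2]);   /* 8-bit sum, as in bmz8 */
}
```

With seeds 0 and 1, n = 4 and g = {0, 0, 1, 1}, the one-byte keys "d", "e" and "f" hash to the edges (0,1), (1,2) and (2,3), so they look up to 0, 1 and 2. memcpy is used instead of pointer casts so the sketch does not depend on the alignment of the buffer.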

42
girepository/cmph/bmz8.h Normal file

@ -0,0 +1,42 @@
#ifndef __CMPH_BMZ8_H__
#define __CMPH_BMZ8_H__
#include "cmph.h"
typedef struct __bmz8_data_t bmz8_data_t;
typedef struct __bmz8_config_data_t bmz8_config_data_t;
bmz8_config_data_t *bmz8_config_new(void);
void bmz8_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void bmz8_config_destroy(cmph_config_t *mph);
cmph_t *bmz8_new(cmph_config_t *mph, double c);
void bmz8_load(FILE *f, cmph_t *mphf);
int bmz8_dump(cmph_t *mphf, FILE *f);
void bmz8_destroy(cmph_t *mphf);
cmph_uint8 bmz8_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void bmz8_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void bmz8_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 bmz8_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 bmz8_packed_size(cmph_t *mphf);
/** \fn cmph_uint8 bmz8_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint8 bmz8_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif
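
bmz8_new(), declared above, starts with a mapping step: every key becomes an edge (h1, h2) between two of the n vertices, and the attempt is abandoned if the resulting graph has a repeated edge. A minimal sketch of that check follows, assuming a hypothetical toy_hash and an adjacency matrix in place of cmph's graph_t. The names toy_gen_edges and TOY_MAX_VERTICES are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_VERTICES 16

/* Hypothetical hash: seed plus the sum of the key bytes. */
static uint32_t toy_hash(uint32_t seed, const char *key)
{
    uint32_t h = seed;
    for (size_t i = 0; key[i] != '\0'; ++i) h += (unsigned char)key[i];
    return h;
}

/* Returns 1 if the keys map to a graph without repeated edges, else 0. */
static int toy_gen_edges(const char **keys, uint8_t nkeys, uint8_t n,
                         uint32_t seed1, uint32_t seed2)
{
    uint8_t adj[TOY_MAX_VERTICES][TOY_MAX_VERTICES];
    assert(n >= 2 && n <= TOY_MAX_VERTICES);
    memset(adj, 0, sizeof adj);
    for (uint8_t e = 0; e < nkeys; ++e) {
        uint8_t h1 = (uint8_t)(toy_hash(seed1, keys[e]) % n);
        uint8_t h2 = (uint8_t)(toy_hash(seed2, keys[e]) % n);
        /* Avoid a self-loop by moving h2 to the next vertex, wrapping at n. */
        if (h1 == h2 && ++h2 >= n) h2 = 0;
        if (adj[h1][h2]) return 0; /* repeated edge: caller retries with new seeds */
        adj[h1][h2] = adj[h2][h1] = 1;
    }
    return 1;
}
```

Because n is at least 2 here, the tie-break always yields two distinct vertices, so only repeated edges can make the attempt fail. In the sketch a failure would be handled by the caller choosing different seeds, which corresponds to the retry loop around hash_state_new() in bmz8_new().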


@ -0,0 +1,25 @@
#ifndef __CMPH_BMZ8_STRUCTS_H__
#define __CMPH_BMZ8_STRUCTS_H__
#include "hash_state.h"
struct __bmz8_data_t
{
cmph_uint8 m; //edges (words) count
cmph_uint8 n; //vertex count
cmph_uint8 *g;
hash_state_t **hashes;
};
struct __bmz8_config_data_t
{
CMPH_HASH hashfuncs[2];
cmph_uint8 m; //edges (words) count
cmph_uint8 n; //vertex count
graph_t *graph;
cmph_uint8 *g;
hash_state_t **hashes;
};
#endif
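
Both structs above store m and n as cmph_uint8, so the vertex count n = ceil(c * m) has to fit in 8 bits as well as the key count. The cast in bmz8_new() does not appear to check this, and with the default c = 1.15 the product exceeds 255 once m reaches 222. The helper below is a hedged sketch of such a range check. The name toy_bmz8_vertex_count is hypothetical and nothing like it exists in cmph.

```c
#include <assert.h>
#include <stdint.h>

/* Returns ceil(c * m) when it fits in 8 bits, or 0 when it does not. */
static uint8_t toy_bmz8_vertex_count(uint32_t m, double c)
{
    assert(c > 0.0);
    double x = c * (double)m;
    if (x > 255.0) return 0;      /* would not fit in a cmph_uint8-sized field */
    uint32_t n = (uint32_t)x;     /* truncates toward zero */
    if ((double)n < x) ++n;       /* manual ceiling, avoids depending on <math.h> */
    return (uint8_t)n;
}
```

A caller would treat a zero result for a non-zero m as "too many keys for the 8-bit variant" and fall back to an algorithm with wider fields.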


@ -0,0 +1,25 @@
#ifndef __CMPH_BMZ_STRUCTS_H__
#define __CMPH_BMZ_STRUCTS_H__
#include "hash_state.h"
struct __bmz_data_t
{
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
cmph_uint32 *g;
hash_state_t **hashes;
};
struct __bmz_config_data_t
{
CMPH_HASH hashfuncs[2];
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
graph_t *graph;
cmph_uint32 *g;
hash_state_t **hashes;
};
#endif

1040
girepository/cmph/brz.c Normal file

File diff suppressed because it is too large

47
girepository/cmph/brz.h Normal file

@ -0,0 +1,47 @@
#ifndef __CMPH_BRZ_H__
#define __CMPH_BRZ_H__
#include "cmph.h"
typedef struct __brz_data_t brz_data_t;
typedef struct __brz_config_data_t brz_config_data_t;
brz_config_data_t *brz_config_new(void);
void brz_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void brz_config_set_tmp_dir(cmph_config_t *mph, cmph_uint8 *tmp_dir);
void brz_config_set_mphf_fd(cmph_config_t *mph, FILE *mphf_fd);
void brz_config_set_b(cmph_config_t *mph, cmph_uint32 b);
void brz_config_set_algo(cmph_config_t *mph, CMPH_ALGO algo);
void brz_config_set_memory_availability(cmph_config_t *mph, cmph_uint32 memory_availability);
void brz_config_destroy(cmph_config_t *mph);
cmph_t *brz_new(cmph_config_t *mph, double c);
void brz_load(FILE *f, cmph_t *mphf);
int brz_dump(cmph_t *mphf, FILE *f);
void brz_destroy(cmph_t *mphf);
cmph_uint32 brz_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void brz_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void brz_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 brz_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 brz_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 brz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 brz_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif


@ -0,0 +1,39 @@
#ifndef __CMPH_BRZ_STRUCTS_H__
#define __CMPH_BRZ_STRUCTS_H__
#include "hash_state.h"
struct __brz_data_t
{
CMPH_ALGO algo; // CMPH algo for generating the MPHFs for the buckets (Just CMPH_FCH and CMPH_BMZ8)
cmph_uint32 m; // edges (words) count
double c; // constant c
cmph_uint8 *size; // size[i] stores the number of edges represented by g[i][...].
cmph_uint32 *offset; // offset[i] stores the sum: size[0] + size[1] + ... size[i-1].
cmph_uint8 **g; // g function.
cmph_uint32 k; // number of components
hash_state_t **h1;
hash_state_t **h2;
hash_state_t * h0;
};
struct __brz_config_data_t
{
CMPH_HASH hashfuncs[3];
CMPH_ALGO algo; // CMPH algo for generating the MPHFs for the buckets (Just CMPH_FCH and CMPH_BMZ8)
double c; // constant c
cmph_uint32 m; // edges (words) count
cmph_uint8 *size; // size[i] stores the number of edges represented by g[i][...].
cmph_uint32 *offset; // offset[i] stores the sum: size[0] + size[1] + ... size[i-1].
cmph_uint8 **g; // g function.
cmph_uint8 b; // parameter b.
cmph_uint32 k; // number of components
hash_state_t **h1;
hash_state_t **h2;
hash_state_t * h0;
cmph_uint32 memory_availability;
cmph_uint8 * tmp_dir; // temporary directory
FILE * mphf_fd; // mphf file
};
#endif
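
The comments on size and offset above describe a prefix sum: offset[i] is the number of keys in all buckets before bucket i. Assuming the final hash value is the bucket's offset plus the value of the per-bucket function (brz.c is not shown in this view, so that combination is an assumption), the bookkeeping can be sketched as follows. The names toy_fill_offsets and toy_global_index are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* offset[i] = size[0] + size[1] + ... + size[i-1], with offset[0] = 0. */
static void toy_fill_offsets(const uint8_t *size, uint32_t *offset, uint32_t k)
{
    uint32_t sum = 0;
    assert(size != 0 && offset != 0);
    for (uint32_t i = 0; i < k; ++i) {
        offset[i] = sum;
        sum += size[i];
    }
}

/* Combine a bucket number and the bucket-local hash value into one index. */
static uint32_t toy_global_index(const uint32_t *offset, uint32_t bucket, uint8_t local)
{
    return offset[bucket] + local;
}
```

For bucket sizes {3, 0, 2, 5} the offsets are {0, 3, 3, 5}, so local value 1 in bucket 2 becomes global index 4. Storing sizes as 8-bit values is what limits each bucket to at most 255 keys.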


@ -0,0 +1,103 @@
#include "buffer_entry.h"
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
struct __buffer_entry_t
{
FILE *fd;
cmph_uint8 * buff;
cmph_uint32 capacity, // buffer entry capacity
nbytes, // buffer entry used bytes
pos; // current read position in buffer entry
cmph_uint8 eof; // flag to indicate end of file
};
buffer_entry_t * buffer_entry_new(cmph_uint32 capacity)
{
buffer_entry_t *buff_entry = (buffer_entry_t *)malloc(sizeof(buffer_entry_t));
assert(buff_entry);
buff_entry->fd = NULL;
buff_entry->buff = NULL;
buff_entry->capacity = capacity;
buff_entry->nbytes = capacity;
buff_entry->pos = capacity;
buff_entry->eof = 0;
return buff_entry;
}
void buffer_entry_open(buffer_entry_t * buffer_entry, char * filename)
{
buffer_entry->fd = fopen(filename, "rb");
}
void buffer_entry_set_capacity(buffer_entry_t * buffer_entry, cmph_uint32 capacity)
{
buffer_entry->capacity = capacity;
}
cmph_uint32 buffer_entry_get_capacity(buffer_entry_t * buffer_entry)
{
return buffer_entry->capacity;
}
static void buffer_entry_load(buffer_entry_t * buffer_entry)
{
free(buffer_entry->buff);
buffer_entry->buff = (cmph_uint8 *)calloc((size_t)buffer_entry->capacity, sizeof(cmph_uint8));
buffer_entry->nbytes = (cmph_uint32)fread(buffer_entry->buff, (size_t)1, (size_t)buffer_entry->capacity, buffer_entry->fd);
if (buffer_entry->nbytes != buffer_entry->capacity) buffer_entry->eof = 1;
buffer_entry->pos = 0;
}
cmph_uint8 * buffer_entry_read_key(buffer_entry_t * buffer_entry, cmph_uint32 * keylen)
{
cmph_uint8 * buf = NULL;
cmph_uint32 lacked_bytes = sizeof(*keylen);
cmph_uint32 copied_bytes = 0;
if(buffer_entry->eof && (buffer_entry->pos == buffer_entry->nbytes)) // end
{
free(buf);
return NULL;
}
if((buffer_entry->pos + lacked_bytes) > buffer_entry->nbytes)
{
copied_bytes = buffer_entry->nbytes - buffer_entry->pos;
lacked_bytes = (buffer_entry->pos + lacked_bytes) - buffer_entry->nbytes;
if (copied_bytes != 0) memcpy(keylen, buffer_entry->buff + buffer_entry->pos, (size_t)copied_bytes);
buffer_entry_load(buffer_entry);
}
memcpy((cmph_uint8 *)keylen + copied_bytes, buffer_entry->buff + buffer_entry->pos, (size_t)lacked_bytes); // byte offset into keylen, not an offset in cmph_uint32 units
buffer_entry->pos += lacked_bytes;
lacked_bytes = *keylen;
copied_bytes = 0;
buf = (cmph_uint8 *)malloc(*keylen + sizeof(*keylen));
memcpy(buf, keylen, sizeof(*keylen));
if((buffer_entry->pos + lacked_bytes) > buffer_entry->nbytes) {
copied_bytes = buffer_entry->nbytes - buffer_entry->pos;
lacked_bytes = (buffer_entry->pos + lacked_bytes) - buffer_entry->nbytes;
if (copied_bytes != 0) {
memcpy(buf + sizeof(*keylen), buffer_entry->buff + buffer_entry->pos, (size_t)copied_bytes);
}
buffer_entry_load(buffer_entry);
}
memcpy(buf+sizeof(*keylen)+copied_bytes, buffer_entry->buff + buffer_entry->pos, (size_t)lacked_bytes);
buffer_entry->pos += lacked_bytes;
return buf;
}
void buffer_entry_destroy(buffer_entry_t * buffer_entry)
{
fclose(buffer_entry->fd);
buffer_entry->fd = NULL;
free(buffer_entry->buff);
buffer_entry->buff = NULL;
buffer_entry->capacity = 0;
buffer_entry->nbytes = 0;
buffer_entry->pos = 0;
buffer_entry->eof = 0;
free(buffer_entry);
}
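
buffer_entry_read_key() above reads records of the form [32-bit keylen][key bytes] through a fixed-capacity buffer, refilling it whenever a record straddles the end of the buffered data. The sketch below shows the same idea with a single copy loop. The names chunk_reader, chunk_refill, chunk_read and chunk_read_key are hypothetical, and unlike the original, which returns a buffer that still begins with the length prefix, this version returns only the key bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    FILE *fd;
    uint8_t *buff;      /* caller-provided storage of `capacity` bytes */
    uint32_t capacity;  /* size of buff */
    uint32_t nbytes;    /* valid bytes currently in buff */
    uint32_t pos;       /* read position within buff */
    int eof;            /* set once a refill comes back short */
} chunk_reader;

static void chunk_refill(chunk_reader *r)
{
    r->nbytes = (uint32_t)fread(r->buff, 1, r->capacity, r->fd);
    if (r->nbytes != r->capacity) r->eof = 1;
    r->pos = 0;
}

/* Copy exactly len bytes into dst, refilling as needed. Returns 0 on short input. */
static int chunk_read(chunk_reader *r, void *dst, uint32_t len)
{
    uint8_t *out = dst;
    while (len > 0) {
        if (r->pos == r->nbytes) {
            if (r->eof) return 0;
            chunk_refill(r);
            if (r->nbytes == 0) return 0;
        }
        uint32_t avail = r->nbytes - r->pos;
        uint32_t take = avail < len ? avail : len;
        memcpy(out, r->buff + r->pos, take);
        out += take;
        r->pos += take;
        len -= take;
    }
    return 1;
}

/* Read one record; returns a malloc'd copy of the key bytes, or NULL at end of input. */
static uint8_t *chunk_read_key(chunk_reader *r, uint32_t *keylen)
{
    assert(r != NULL && keylen != NULL);
    if (!chunk_read(r, keylen, (uint32_t)sizeof *keylen)) return NULL;
    uint8_t *key = malloc(*keylen ? *keylen : 1);
    if (key == NULL) return NULL;
    if (!chunk_read(r, key, *keylen)) { free(key); return NULL; }
    return key;
}
```

Routing both the length prefix and the key through one byte-oriented copy loop means the destination offset is always counted in bytes, whichever field is being filled. The reader starts with nbytes and pos both zero, so the first read triggers a refill.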


@ -0,0 +1,14 @@
#ifndef __CMPH_BUFFER_ENTRY_H__
#define __CMPH_BUFFER_ENTRY_H__
#include "cmph_types.h"
#include <stdio.h>
typedef struct __buffer_entry_t buffer_entry_t;
buffer_entry_t * buffer_entry_new(cmph_uint32 capacity);
void buffer_entry_set_capacity(buffer_entry_t * buffer_entry, cmph_uint32 capacity);
cmph_uint32 buffer_entry_get_capacity(buffer_entry_t * buffer_entry);
void buffer_entry_open(buffer_entry_t * buffer_entry, char * filename);
cmph_uint8 * buffer_entry_read_key(buffer_entry_t * buffer_entry, cmph_uint32 * keylen);
void buffer_entry_destroy(buffer_entry_t * buffer_entry);
#endif


@ -0,0 +1,66 @@
#include "buffer_manage.h"
#include "buffer_entry.h"
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
struct __buffer_manage_t
{
cmph_uint32 memory_avail; // memory available
buffer_entry_t ** buffer_entries; // buffer entries to be managed
cmph_uint32 nentries; // number of entries to be managed
cmph_uint32 *memory_avail_list; // memory available list
int pos_avail_list; // current position in memory available list
};
buffer_manage_t * buffer_manage_new(cmph_uint32 memory_avail, cmph_uint32 nentries)
{
cmph_uint32 memory_avail_entry, i;
buffer_manage_t *buff_manage = (buffer_manage_t *)malloc(sizeof(buffer_manage_t));
assert(buff_manage);
buff_manage->memory_avail = memory_avail;
buff_manage->buffer_entries = (buffer_entry_t **)calloc((size_t)nentries, sizeof(buffer_entry_t *));
buff_manage->memory_avail_list = (cmph_uint32 *)calloc((size_t)nentries, sizeof(cmph_uint32));
buff_manage->pos_avail_list = -1;
buff_manage->nentries = nentries;
memory_avail_entry = buff_manage->memory_avail/buff_manage->nentries + 1;
for(i = 0; i < buff_manage->nentries; i++)
{
buff_manage->buffer_entries[i] = buffer_entry_new(memory_avail_entry);
}
return buff_manage;
}
void buffer_manage_open(buffer_manage_t * buffer_manage, cmph_uint32 index, char * filename)
{
buffer_entry_open(buffer_manage->buffer_entries[index], filename);
}
cmph_uint8 * buffer_manage_read_key(buffer_manage_t * buffer_manage, cmph_uint32 index)
{
cmph_uint8 * key = NULL;
cmph_uint32 keylen = 0; // required by buffer_entry_read_key(); the returned buffer also starts with this length prefix
if (buffer_manage->pos_avail_list >= 0 ) // recovering memory
{
cmph_uint32 new_capacity = buffer_entry_get_capacity(buffer_manage->buffer_entries[index]) + buffer_manage->memory_avail_list[(buffer_manage->pos_avail_list)--];
buffer_entry_set_capacity(buffer_manage->buffer_entries[index], new_capacity);
//fprintf(stderr, "recovering memory\n");
}
key = buffer_entry_read_key(buffer_manage->buffer_entries[index], &keylen);
if (key == NULL) // storing memory to be recovered
{
buffer_manage->memory_avail_list[++(buffer_manage->pos_avail_list)] = buffer_entry_get_capacity(buffer_manage->buffer_entries[index]);
//fprintf(stderr, "storing memory to be recovered\n");
}
return key;
}
void buffer_manage_destroy(buffer_manage_t * buffer_manage)
{
cmph_uint32 i;
for(i = 0; i < buffer_manage->nentries; i++)
{
buffer_entry_destroy(buffer_manage->buffer_entries[i]);
}
free(buffer_manage->memory_avail_list);
free(buffer_manage->buffer_entries);
free(buffer_manage);
}


@ -0,0 +1,12 @@
#ifndef __CMPH_BUFFER_MANAGE_H__
#define __CMPH_BUFFER_MANAGE_H__
#include "cmph_types.h"
#include <stdio.h>
typedef struct __buffer_manage_t buffer_manage_t;
buffer_manage_t * buffer_manage_new(cmph_uint32 memory_avail, cmph_uint32 nentries);
void buffer_manage_open(buffer_manage_t * buffer_manage, cmph_uint32 index, char * filename);
cmph_uint8 * buffer_manage_read_key(buffer_manage_t * buffer_manage, cmph_uint32 index);
void buffer_manage_destroy(buffer_manage_t * buffer_manage);
#endif


@ -0,0 +1,64 @@
#include "buffer_manager.h"
#include "buffer_entry.h"
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
struct __buffer_manager_t
{
cmph_uint32 memory_avail; // memory available
buffer_entry_t ** buffer_entries; // buffer entries to be managed
cmph_uint32 nentries; // number of entries to be managed
cmph_uint32 *memory_avail_list; // memory available list
int pos_avail_list; // current position in memory available list
};
buffer_manager_t * buffer_manager_new(cmph_uint32 memory_avail, cmph_uint32 nentries)
{
cmph_uint32 memory_avail_entry, i;
buffer_manager_t *buff_manager = (buffer_manager_t *)malloc(sizeof(buffer_manager_t));
assert(buff_manager);
buff_manager->memory_avail = memory_avail;
buff_manager->buffer_entries = (buffer_entry_t **)calloc((size_t)nentries, sizeof(buffer_entry_t *));
buff_manager->memory_avail_list = (cmph_uint32 *)calloc((size_t)nentries, sizeof(cmph_uint32));
buff_manager->pos_avail_list = -1;
buff_manager->nentries = nentries;
memory_avail_entry = buff_manager->memory_avail/buff_manager->nentries + 1;
for(i = 0; i < buff_manager->nentries; i++)
{
buff_manager->buffer_entries[i] = buffer_entry_new(memory_avail_entry);
}
return buff_manager;
}
void buffer_manager_open(buffer_manager_t * buffer_manager, cmph_uint32 index, char * filename)
{
buffer_entry_open(buffer_manager->buffer_entries[index], filename);
}
cmph_uint8 * buffer_manager_read_key(buffer_manager_t * buffer_manager, cmph_uint32 index, cmph_uint32 * keylen)
{
cmph_uint8 * key = NULL;
if (buffer_manager->pos_avail_list >= 0 ) // recovering memory
{
cmph_uint32 new_capacity = buffer_entry_get_capacity(buffer_manager->buffer_entries[index]) + buffer_manager->memory_avail_list[(buffer_manager->pos_avail_list)--];
buffer_entry_set_capacity(buffer_manager->buffer_entries[index], new_capacity);
}
key = buffer_entry_read_key(buffer_manager->buffer_entries[index], keylen);
if (key == NULL) // storing memory to be recovered
{
buffer_manager->memory_avail_list[++(buffer_manager->pos_avail_list)] = buffer_entry_get_capacity(buffer_manager->buffer_entries[index]);
}
return key;
}
void buffer_manager_destroy(buffer_manager_t * buffer_manager)
{
cmph_uint32 i;
for(i = 0; i < buffer_manager->nentries; i++)
{
buffer_entry_destroy(buffer_manager->buffer_entries[i]);
}
free(buffer_manager->memory_avail_list);
free(buffer_manager->buffer_entries);
free(buffer_manager);
}
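
Note on the memory recovery in buffer_manager_read_key() above: when an entry has no more keys (buffer_entry_read_key() returns NULL), its capacity is pushed onto memory_avail_list, and a later read pops one stored capacity and adds it to the entry being read. The block below is a minimal standalone sketch of that push/pop bookkeeping only. The names spare_stack_t, spare_init, spare_push and spare_take are hypothetical and not part of cmph, and the fixed slot count is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>

#define SPARE_SLOTS 8 /* assumption: small fixed bound, only for this sketch */

/* Hypothetical stand-in for memory_avail_list / pos_avail_list. */
typedef struct
{
	uint32_t slots[SPARE_SLOTS];
	int top; /* -1 when empty, like pos_avail_list */
} spare_stack_t;

static void spare_init(spare_stack_t *s)
{
	s->top = -1;
}

/* An exhausted entry donates its capacity. */
static void spare_push(spare_stack_t *s, uint32_t capacity)
{
	assert(s->top + 1 < SPARE_SLOTS);
	s->slots[++(s->top)] = capacity;
}

/* A reading entry takes one donated capacity, if any is stored,
 * and returns its new capacity. */
static uint32_t spare_take(spare_stack_t *s, uint32_t capacity)
{
	if (s->top >= 0)
	{
		capacity += s->slots[(s->top)--];
	}
	return capacity;
}
```

With two donations of 50 and 30 stored, an entry of capacity 100 would grow to 130 on its next read and to 180 on the one after, leaving the stack empty again.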

@ -0,0 +1,12 @@
#ifndef __CMPH_BUFFER_MANAGER_H__
#define __CMPH_BUFFER_MANAGER_H__
#include "cmph_types.h"
#include <stdio.h>
typedef struct __buffer_manager_t buffer_manager_t;
buffer_manager_t * buffer_manager_new(cmph_uint32 memory_avail, cmph_uint32 nentries);
void buffer_manager_open(buffer_manager_t * buffer_manager, cmph_uint32 index, char * filename);
cmph_uint8 * buffer_manager_read_key(buffer_manager_t * buffer_manager, cmph_uint32 index, cmph_uint32 * keylen);
void buffer_manager_destroy(buffer_manager_t * buffer_manager);
#endif

280
girepository/cmph/chd.c Normal file
@ -0,0 +1,280 @@
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<math.h>
#include<time.h>
#include<assert.h>
#include<limits.h>
#include<errno.h>
#include "cmph_structs.h"
#include "chd_structs.h"
#include "chd.h"
#include "bitbool.h"
//#define DEBUG
#include "debug.h"
chd_config_data_t *chd_config_new(cmph_config_t *mph)
{
cmph_io_adapter_t *key_source = mph->key_source;
chd_config_data_t *chd;
chd = (chd_config_data_t *)malloc(sizeof(chd_config_data_t));
assert(chd);
memset(chd, 0, sizeof(chd_config_data_t));
chd->chd_ph = cmph_config_new(key_source);
cmph_config_set_algo(chd->chd_ph, CMPH_CHD_PH);
return chd;
}
void chd_config_destroy(cmph_config_t *mph)
{
chd_config_data_t *data = (chd_config_data_t *) mph->data;
DEBUGP("Destroying algorithm dependent data\n");
if(data->chd_ph)
{
cmph_config_destroy(data->chd_ph);
data->chd_ph = NULL;
}
free(data);
}
void chd_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
chd_config_data_t *data = (chd_config_data_t *) mph->data;
cmph_config_set_hashfuncs(data->chd_ph, hashfuncs);
}
void chd_config_set_b(cmph_config_t *mph, cmph_uint32 keys_per_bucket)
{
chd_config_data_t *data = (chd_config_data_t *) mph->data;
cmph_config_set_b(data->chd_ph, keys_per_bucket);
}
void chd_config_set_keys_per_bin(cmph_config_t *mph, cmph_uint32 keys_per_bin)
{
chd_config_data_t *data = (chd_config_data_t *) mph->data;
cmph_config_set_keys_per_bin(data->chd_ph, keys_per_bin);
}
cmph_t *chd_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
chd_data_t *chdf = NULL;
chd_config_data_t *chd = (chd_config_data_t *)mph->data;
chd_ph_config_data_t * chd_ph = (chd_ph_config_data_t *)chd->chd_ph->data;
compressed_rank_t cr;
register cmph_t * chd_phf = NULL;
register cmph_uint32 packed_chd_phf_size = 0;
cmph_uint8 * packed_chd_phf = NULL;
register cmph_uint32 packed_cr_size = 0;
cmph_uint8 * packed_cr = NULL;
register cmph_uint32 i, idx, nkeys, nvals, nbins;
cmph_uint32 * vals_table = NULL;
register cmph_uint32 * occup_table = NULL;
#ifdef CMPH_TIMING
double construction_time_begin = 0.0;
double construction_time = 0.0;
ELAPSED_TIME_IN_SECONDS(&construction_time_begin);
#endif
cmph_config_set_verbosity(chd->chd_ph, mph->verbosity);
cmph_config_set_graphsize(chd->chd_ph, c);
if (mph->verbosity)
{
fprintf(stderr, "Generating a CHD_PH perfect hash function with a load factor equal to %.3f\n", c);
}
chd_phf = cmph_new(chd->chd_ph);
if(chd_phf == NULL)
{
return NULL;
}
packed_chd_phf_size = cmph_packed_size(chd_phf);
DEBUGP("packed_chd_phf_size = %u\n", packed_chd_phf_size);
/* Make sure that we have enough space to pack the mphf. */
packed_chd_phf = calloc((size_t)packed_chd_phf_size,(size_t)1);
/* Pack the mphf. */
cmph_pack(chd_phf, packed_chd_phf);
cmph_destroy(chd_phf);
if (mph->verbosity)
{
fprintf(stderr, "Compressing the range of the resulting CHD_PH perfect hash function\n");
}
compressed_rank_init(&cr);
nbins = chd_ph->n;
nkeys = chd_ph->m;
nvals = nbins - nkeys;
vals_table = (cmph_uint32 *)calloc(nvals, sizeof(cmph_uint32));
occup_table = (cmph_uint32 *)chd_ph->occup_table;
for(i = 0, idx = 0; i < nbins; i++)
{
if(!GETBIT32(occup_table, i))
{
vals_table[idx++] = i;
}
}
compressed_rank_generate(&cr, vals_table, nvals);
free(vals_table);
packed_cr_size = compressed_rank_packed_size(&cr);
packed_cr = (cmph_uint8 *) calloc(packed_cr_size, sizeof(cmph_uint8));
compressed_rank_pack(&cr, packed_cr);
compressed_rank_destroy(&cr);
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
chdf = (chd_data_t *)malloc(sizeof(chd_data_t));
chdf->packed_cr = packed_cr;
packed_cr = NULL; //transfer memory ownership
chdf->packed_chd_phf = packed_chd_phf;
packed_chd_phf = NULL; //transfer memory ownership
chdf->packed_chd_phf_size = packed_chd_phf_size;
chdf->packed_cr_size = packed_cr_size;
mphf->data = chdf;
mphf->size = nkeys;
DEBUGP("Successfully generated minimal perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated minimal perfect hash function\n");
}
#ifdef CMPH_TIMING
ELAPSED_TIME_IN_SECONDS(&construction_time);
register cmph_uint32 space_usage = chd_packed_size(mphf)*8;
construction_time = construction_time - construction_time_begin;
fprintf(stdout, "%u\t%.2f\t%u\t%.4f\t%.4f\n", nkeys, c, chd_ph->keys_per_bucket, construction_time, space_usage/(double)nkeys);
#endif
return mphf;
}
void chd_load(FILE *fd, cmph_t *mphf)
{
register size_t nbytes;
chd_data_t *chd = (chd_data_t *)malloc(sizeof(chd_data_t));
DEBUGP("Loading chd mphf\n");
mphf->data = chd;
nbytes = fread(&chd->packed_chd_phf_size, sizeof(cmph_uint32), (size_t)1, fd);
DEBUGP("Loading CHD_PH perfect hash function with %u bytes from disk\n", chd->packed_chd_phf_size);
chd->packed_chd_phf = (cmph_uint8 *) calloc((size_t)chd->packed_chd_phf_size,(size_t)1);
nbytes = fread(chd->packed_chd_phf, chd->packed_chd_phf_size, (size_t)1, fd);
nbytes = fread(&chd->packed_cr_size, sizeof(cmph_uint32), (size_t)1, fd);
DEBUGP("Loading Compressed rank structure, which has %u bytes\n", chd->packed_cr_size);
chd->packed_cr = (cmph_uint8 *) calloc((size_t)chd->packed_cr_size, (size_t)1);
nbytes = fread(chd->packed_cr, chd->packed_cr_size, (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
}
}
int chd_dump(cmph_t *mphf, FILE *fd)
{
register size_t nbytes;
chd_data_t *data = (chd_data_t *)mphf->data;
__cmph_dump(mphf, fd);
// Dumping CHD_PH perfect hash function
DEBUGP("Dumping CHD_PH perfect hash function with %u bytes to disk\n", data->packed_chd_phf_size);
nbytes = fwrite(&data->packed_chd_phf_size, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(data->packed_chd_phf, data->packed_chd_phf_size, (size_t)1, fd);
DEBUGP("Dumping compressed rank structure with %u bytes to disk\n", data->packed_cr_size);
nbytes = fwrite(&data->packed_cr_size, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(data->packed_cr, data->packed_cr_size, (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return 0;
}
return 1;
}
void chd_destroy(cmph_t *mphf)
{
chd_data_t *data = (chd_data_t *)mphf->data;
free(data->packed_chd_phf);
free(data->packed_cr);
free(data);
free(mphf);
}
static inline cmph_uint32 _chd_search(void * packed_chd_phf, void * packed_cr, const char *key, cmph_uint32 keylen)
{
register cmph_uint32 bin_idx = cmph_search_packed(packed_chd_phf, key, keylen);
register cmph_uint32 rank = compressed_rank_query_packed(packed_cr, bin_idx);
return bin_idx - rank;
}
cmph_uint32 chd_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
register chd_data_t * chd = mphf->data;
return _chd_search(chd->packed_chd_phf, chd->packed_cr, key, keylen);
}
void chd_pack(cmph_t *mphf, void *packed_mphf)
{
chd_data_t *data = (chd_data_t *)mphf->data;
cmph_uint32 * ptr = packed_mphf;
cmph_uint8 * ptr8;
// packing packed_cr_size and packed_cr
*ptr = data->packed_cr_size;
ptr8 = (cmph_uint8 *) (ptr + 1);
memcpy(ptr8, data->packed_cr, data->packed_cr_size);
ptr8 += data->packed_cr_size;
ptr = (cmph_uint32 *) ptr8;
*ptr = data->packed_chd_phf_size;
ptr8 = (cmph_uint8 *) (ptr + 1);
memcpy(ptr8, data->packed_chd_phf, data->packed_chd_phf_size);
}
cmph_uint32 chd_packed_size(cmph_t *mphf)
{
register chd_data_t *data = (chd_data_t *)mphf->data;
return (cmph_uint32)(sizeof(CMPH_ALGO) + 2*sizeof(cmph_uint32) + data->packed_cr_size + data->packed_chd_phf_size);
}
cmph_uint32 chd_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
register cmph_uint32 * ptr = packed_mphf;
register cmph_uint32 packed_cr_size = *ptr++;
register cmph_uint8 * packed_chd_phf = ((cmph_uint8 *) ptr) + packed_cr_size + sizeof(cmph_uint32);
return _chd_search(packed_chd_phf, ptr, key, keylen);
}
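
Note on _chd_search() above: the CHD_PH function maps keys into nbins bins, of which nbins - nkeys stay empty, and chd_new() records the empty bins in vals_table. Subtracting the number of empty bins that precede a key's bin turns the bin index into a dense index in [0, nkeys). The block below is a minimal sketch of that idea using a plain sorted array and a binary search in place of the packed compressed rank structure. The names naive_rank and to_minimal_index are hypothetical, and the exact counting convention of compressed_rank_query_packed() is not shown in this diff, so the "strictly smaller" rule is an assumption (it does not matter for occupied bins).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of entries in the sorted array
 * empty_bins[0..nempty) that are strictly smaller than bin_idx. */
static uint32_t naive_rank(const uint32_t *empty_bins, size_t nempty, uint32_t bin_idx)
{
	size_t lo = 0, hi = nempty;
	while (lo < hi)
	{
		size_t mid = lo + (hi - lo) / 2;
		if (empty_bins[mid] < bin_idx) lo = mid + 1;
		else hi = mid;
	}
	return (uint32_t)lo;
}

/* Maps an occupied bin index to a dense index, as bin_idx - rank does above. */
static uint32_t to_minimal_index(const uint32_t *empty_bins, size_t nempty, uint32_t bin_idx)
{
	assert(empty_bins != NULL || nempty == 0);
	return bin_idx - naive_rank(empty_bins, nempty, bin_idx);
}
```

For 8 bins with bins 1, 4 and 5 empty, the occupied bins 0, 2, 3, 6 and 7 map to 0, 1, 2, 3 and 4.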

59
girepository/cmph/chd.h Normal file
@ -0,0 +1,59 @@
#ifndef _CMPH_CHD_H__
#define _CMPH_CHD_H__
#include "cmph.h"
typedef struct __chd_data_t chd_data_t;
typedef struct __chd_config_data_t chd_config_data_t;
/* Config API */
chd_config_data_t *chd_config_new(cmph_config_t * mph);
void chd_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
/** \fn void chd_config_set_keys_per_bin(cmph_config_t *mph, cmph_uint32 keys_per_bin);
* \brief Sets the number of keys per bin.
* \param mph pointer to the configuration structure
* \param keys_per_bin value for the number of keys per bin
*/
void chd_config_set_keys_per_bin(cmph_config_t *mph, cmph_uint32 keys_per_bin);
/** \fn void chd_config_set_b(cmph_config_t *mph, cmph_uint32 keys_per_bucket);
* \brief Sets the number of keys per bucket.
* \param mph pointer to the configuration structure
* \param keys_per_bucket value for the number of keys per bucket
*/
void chd_config_set_b(cmph_config_t *mph, cmph_uint32 keys_per_bucket);
void chd_config_destroy(cmph_config_t *mph);
/* Chd algorithm API */
cmph_t *chd_new(cmph_config_t *mph, double c);
void chd_load(FILE *fd, cmph_t *mphf);
int chd_dump(cmph_t *mphf, FILE *fd);
void chd_destroy(cmph_t *mphf);
cmph_uint32 chd_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void chd_pack(cmph_t *mphf, void *packed_mphf);
* \brief Packs a perfect hash function into a preallocated contiguous memory area pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void chd_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 chd_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 chd_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 chd_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 chd_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif
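
Note on the packed layout used by chd_pack() and chd_search_packed(): the two components are stored back to back, each preceded by its 32-bit size, as [packed_cr_size][packed_cr][packed_chd_phf_size][packed_chd_phf]. The block below is a minimal sketch of writing and walking such a layout for two arbitrary byte blobs. The names blob_pair_packed_size, blob_pair_pack and blob_pair_second are hypothetical, and the sketch copies the size fields with memcpy instead of the pointer casts used in chd.c, which is a deliberate substitution to avoid unaligned 32-bit accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed for [a_len][a bytes][b_len][b bytes]. */
static size_t blob_pair_packed_size(uint32_t a_len, uint32_t b_len)
{
	return 2 * sizeof(uint32_t) + a_len + b_len;
}

/* Writes both blobs, each prefixed by its length, into out. */
static void blob_pair_pack(uint8_t *out, const uint8_t *a, uint32_t a_len,
                           const uint8_t *b, uint32_t b_len)
{
	assert(out != NULL);
	memcpy(out, &a_len, sizeof(a_len));
	out += sizeof(a_len);
	memcpy(out, a, a_len);
	out += a_len;
	memcpy(out, &b_len, sizeof(b_len));
	out += sizeof(b_len);
	memcpy(out, b, b_len);
}

/* Skips the first blob and returns a pointer to the second one,
 * storing its length in *b_len. */
static const uint8_t *blob_pair_second(const uint8_t *packed, uint32_t *b_len)
{
	uint32_t a_len;
	memcpy(&a_len, packed, sizeof(a_len));
	packed += sizeof(a_len) + a_len;
	memcpy(b_len, packed, sizeof(*b_len));
	return packed + sizeof(*b_len);
}
```

The caller is expected to size the destination with blob_pair_packed_size() first, in the same way that the header asks callers of chd_pack() to provide at least cmph_packed_size() bytes.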

1001
girepository/cmph/chd_ph.c Normal file

File diff suppressed because it is too large

@ -0,0 +1,59 @@
#ifndef _CMPH_CHD_PH_H__
#define _CMPH_CHD_PH_H__
#include "cmph.h"
typedef struct __chd_ph_data_t chd_ph_data_t;
typedef struct __chd_ph_config_data_t chd_ph_config_data_t;
/* Config API */
chd_ph_config_data_t *chd_ph_config_new(void);
void chd_ph_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
/** \fn void chd_ph_config_set_keys_per_bin(cmph_config_t *mph, cmph_uint32 keys_per_bin);
* \brief Sets the number of keys per bin.
* \param mph pointer to the configuration structure
* \param keys_per_bin value for the number of keys per bin
*/
void chd_ph_config_set_keys_per_bin(cmph_config_t *mph, cmph_uint32 keys_per_bin);
/** \fn void chd_ph_config_set_b(cmph_config_t *mph, cmph_uint32 keys_per_bucket);
* \brief Sets the number of keys per bucket.
* \param mph pointer to the configuration structure
* \param keys_per_bucket value for the number of keys per bucket
*/
void chd_ph_config_set_b(cmph_config_t *mph, cmph_uint32 keys_per_bucket);
void chd_ph_config_destroy(cmph_config_t *mph);
/* Chd algorithm API */
cmph_t *chd_ph_new(cmph_config_t *mph, double c);
void chd_ph_load(FILE *fd, cmph_t *mphf);
int chd_ph_dump(cmph_t *mphf, FILE *fd);
void chd_ph_destroy(cmph_t *mphf);
cmph_uint32 chd_ph_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void chd_ph_pack(cmph_t *mphf, void *packed_mphf);
* \brief Packs a perfect hash function into a preallocated contiguous memory area pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void chd_ph_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 chd_ph_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 chd_ph_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 chd_ph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 chd_ph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif

@ -0,0 +1,21 @@
#ifndef __CMPH_CHD_STRUCTS_H__
#define __CMPH_CHD_STRUCTS_H__
#include "chd_structs_ph.h"
#include "chd_ph.h"
#include "compressed_rank.h"
struct __chd_data_t
{
cmph_uint32 packed_cr_size;
cmph_uint8 * packed_cr; // packed compressed rank structure to control the number of zeros in a bit vector
cmph_uint32 packed_chd_phf_size;
cmph_uint8 * packed_chd_phf;
};
struct __chd_config_data_t
{
cmph_config_t *chd_ph; // chd_ph algorithm must be used here
};
#endif

@ -0,0 +1,29 @@
#ifndef __CMPH_CHD_PH_STRUCTS_H__
#define __CMPH_CHD_PH_STRUCTS_H__
#include "hash_state.h"
#include "compressed_seq.h"
struct __chd_ph_data_t
{
compressed_seq_t * cs; // compressed displacement values
cmph_uint32 nbuckets; // number of buckets
cmph_uint32 n; // number of bins
hash_state_t *hl; // linear hash function
};
struct __chd_ph_config_data_t
{
CMPH_HASH hashfunc; // linear hash function to be used
compressed_seq_t * cs; // compressed displacement values
cmph_uint32 nbuckets; // number of buckets
cmph_uint32 n; // number of bins
hash_state_t *hl; // linear hash function
cmph_uint32 m; // number of keys
cmph_uint8 use_h; // flag to indicate the use of a heuristic (use_h = 1)
cmph_uint32 keys_per_bin;//maximum number of keys per bin
cmph_uint32 keys_per_bucket; // average number of keys per bucket
cmph_uint8 *occup_table; // table that indicates occupied positions
};
#endif

396
girepository/cmph/chm.c Normal file
@ -0,0 +1,396 @@
#include "graph.h"
#include "chm.h"
#include "cmph_structs.h"
#include "chm_structs.h"
#include "hash.h"
#include "bitbool.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
//#define DEBUG
#include "debug.h"
static int chm_gen_edges(cmph_config_t *mph);
static void chm_traverse(chm_config_data_t *chm, cmph_uint8 *visited, cmph_uint32 v);
chm_config_data_t *chm_config_new(void)
{
chm_config_data_t *chm = NULL;
chm = (chm_config_data_t *)malloc(sizeof(chm_config_data_t));
assert(chm);
memset(chm, 0, sizeof(chm_config_data_t));
chm->hashfuncs[0] = CMPH_HASH_JENKINS;
chm->hashfuncs[1] = CMPH_HASH_JENKINS;
chm->g = NULL;
chm->graph = NULL;
chm->hashes = NULL;
return chm;
}
void chm_config_destroy(cmph_config_t *mph)
{
chm_config_data_t *data = (chm_config_data_t *)mph->data;
DEBUGP("Destroying algorithm dependent data\n");
free(data);
}
void chm_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
chm_config_data_t *chm = (chm_config_data_t *)mph->data;
CMPH_HASH *hashptr = hashfuncs;
cmph_uint32 i = 0;
while(*hashptr != CMPH_HASH_COUNT)
{
if (i >= 2) break; //chm only uses two hash functions
chm->hashfuncs[i] = *hashptr;
++i, ++hashptr;
}
}
cmph_t *chm_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
chm_data_t *chmf = NULL;
cmph_uint32 i;
cmph_uint32 iterations = 20;
cmph_uint8 *visited = NULL;
chm_config_data_t *chm = (chm_config_data_t *)mph->data;
chm->m = mph->key_source->nkeys;
if (c == 0) c = 2.09;
chm->n = (cmph_uint32)ceil(c * mph->key_source->nkeys);
DEBUGP("m (edges): %u n (vertices): %u c: %f\n", chm->m, chm->n, c);
chm->graph = graph_new(chm->n, chm->m);
DEBUGP("Created graph\n");
chm->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*3);
for(i = 0; i < 3; ++i) chm->hashes[i] = NULL;
//Mapping step
if (mph->verbosity)
{
fprintf(stderr, "Entering mapping step for mph creation of %u keys with graph sized %u\n", chm->m, chm->n);
}
while(1)
{
int ok;
chm->hashes[0] = hash_state_new(chm->hashfuncs[0], chm->n);
chm->hashes[1] = hash_state_new(chm->hashfuncs[1], chm->n);
ok = chm_gen_edges(mph);
if (!ok)
{
--iterations;
hash_state_destroy(chm->hashes[0]);
chm->hashes[0] = NULL;
hash_state_destroy(chm->hashes[1]);
chm->hashes[1] = NULL;
DEBUGP("%u iterations remaining\n", iterations);
if (mph->verbosity)
{
fprintf(stderr, "Acyclic graph creation failure - %u iterations remaining\n", iterations);
}
if (iterations == 0) break;
}
else break;
}
if (iterations == 0)
{
graph_destroy(chm->graph);
return NULL;
}
//Assignment step
if (mph->verbosity)
{
fprintf(stderr, "Starting assignment step\n");
}
DEBUGP("Assignment step\n");
visited = (cmph_uint8 *)malloc((size_t)(chm->n/8 + 1));
memset(visited, 0, (size_t)(chm->n/8 + 1));
free(chm->g);
chm->g = (cmph_uint32 *)malloc(chm->n * sizeof(cmph_uint32));
assert(chm->g);
for (i = 0; i < chm->n; ++i)
{
if (!GETBIT(visited,i))
{
chm->g[i] = 0;
chm_traverse(chm, visited, i);
}
}
graph_destroy(chm->graph);
free(visited);
chm->graph = NULL;
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
chmf = (chm_data_t *)malloc(sizeof(chm_data_t));
chmf->g = chm->g;
chm->g = NULL; //transfer memory ownership
chmf->hashes = chm->hashes;
chm->hashes = NULL; //transfer memory ownership
chmf->n = chm->n;
chmf->m = chm->m;
mphf->data = chmf;
mphf->size = chm->m;
DEBUGP("Successfully generated minimal perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated minimal perfect hash function\n");
}
return mphf;
}
static void chm_traverse(chm_config_data_t *chm, cmph_uint8 *visited, cmph_uint32 v)
{
graph_iterator_t it = graph_neighbors_it(chm->graph, v);
cmph_uint32 neighbor = 0;
SETBIT(visited,v);
DEBUGP("Visiting vertex %u\n", v);
while((neighbor = graph_next_neighbor(chm->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
DEBUGP("Visiting neighbor %u\n", neighbor);
if(GETBIT(visited,neighbor)) continue;
DEBUGP("Visiting neighbor %u\n", neighbor);
DEBUGP("Visiting edge %u->%u with id %u\n", v, neighbor, graph_edge_id(chm->graph, v, neighbor));
chm->g[neighbor] = graph_edge_id(chm->graph, v, neighbor) - chm->g[v];
DEBUGP("g is %u (%u - %u mod %u)\n", chm->g[neighbor], graph_edge_id(chm->graph, v, neighbor), chm->g[v], chm->m);
chm_traverse(chm, visited, neighbor);
}
}
static int chm_gen_edges(cmph_config_t *mph)
{
cmph_uint32 e;
chm_config_data_t *chm = (chm_config_data_t *)mph->data;
int cycles = 0;
DEBUGP("Generating edges for %u vertices with hash functions %s and %s\n", chm->n, cmph_hash_names[chm->hashfuncs[0]], cmph_hash_names[chm->hashfuncs[1]]);
graph_clear_edges(chm->graph);
mph->key_source->rewind(mph->key_source->data);
for (e = 0; e < mph->key_source->nkeys; ++e)
{
cmph_uint32 h1, h2;
cmph_uint32 keylen;
char *key;
mph->key_source->read(mph->key_source->data, &key, &keylen);
h1 = hash(chm->hashes[0], key, keylen) % chm->n;
h2 = hash(chm->hashes[1], key, keylen) % chm->n;
if (h1 == h2) if (++h2 >= chm->n) h2 = 0;
if (h1 == h2)
{
if (mph->verbosity) fprintf(stderr, "Self loop for key %u\n", e);
mph->key_source->dispose(mph->key_source->data, key, keylen);
return 0;
}
DEBUGP("Adding edge: %u -> %u for key %s\n", h1, h2, key);
mph->key_source->dispose(mph->key_source->data, key, keylen);
graph_add_edge(chm->graph, h1, h2);
}
cycles = graph_is_cyclic(chm->graph);
if (mph->verbosity && cycles) fprintf(stderr, "Cyclic graph generated\n");
DEBUGP("Looking for cycles: %u\n", cycles);
return ! cycles;
}
int chm_dump(cmph_t *mphf, FILE *fd)
{
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 two = 2; //number of hash functions
chm_data_t *data = (chm_data_t *)mphf->data;
register size_t nbytes;
__cmph_dump(mphf, fd);
nbytes = fwrite(&two, sizeof(cmph_uint32), (size_t)1, fd);
hash_state_dump(data->hashes[0], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
hash_state_dump(data->hashes[1], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
nbytes = fwrite(&(data->n), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->m), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(data->g, sizeof(cmph_uint32)*data->n, (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return 0;
}
/* #ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < data->n; ++i) fprintf(stderr, "%u ", data->g[i]);
fprintf(stderr, "\n");
#endif*/
return 1;
}
void chm_load(FILE *f, cmph_t *mphf)
{
cmph_uint32 nhashes;
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 i;
chm_data_t *chm = (chm_data_t *)malloc(sizeof(chm_data_t));
register size_t nbytes;
DEBUGP("Loading chm mphf\n");
mphf->data = chm;
nbytes = fread(&nhashes, sizeof(cmph_uint32), (size_t)1, f);
chm->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*(nhashes + 1));
chm->hashes[nhashes] = NULL;
DEBUGP("Reading %u hashes\n", nhashes);
for (i = 0; i < nhashes; ++i)
{
hash_state_t *state = NULL;
nbytes = fread(&buflen, sizeof(cmph_uint32), (size_t)1, f);
DEBUGP("Hash state has %u bytes\n", buflen);
buf = (char *)malloc((size_t)buflen);
nbytes = fread(buf, (size_t)buflen, (size_t)1, f);
state = hash_state_load(buf, buflen);
chm->hashes[i] = state;
free(buf);
}
DEBUGP("Reading m and n\n");
nbytes = fread(&(chm->n), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(chm->m), sizeof(cmph_uint32), (size_t)1, f);
chm->g = (cmph_uint32 *)malloc(sizeof(cmph_uint32)*chm->n);
nbytes = fread(chm->g, chm->n*sizeof(cmph_uint32), (size_t)1, f);
if (nbytes == 0 && ferror(f)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < chm->n; ++i) fprintf(stderr, "%u ", chm->g[i]);
fprintf(stderr, "\n");
#endif
return;
}
cmph_uint32 chm_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
chm_data_t *chm = mphf->data;
cmph_uint32 h1 = hash(chm->hashes[0], key, keylen) % chm->n;
cmph_uint32 h2 = hash(chm->hashes[1], key, keylen) % chm->n;
DEBUGP("key: %s h1: %u h2: %u\n", key, h1, h2);
if (h1 == h2 && ++h2 >= chm->n) h2 = 0;
DEBUGP("key: %s g[h1]: %u g[h2]: %u edges: %u\n", key, chm->g[h1], chm->g[h2], chm->m);
return (chm->g[h1] + chm->g[h2]) % chm->m;
}
void chm_destroy(cmph_t *mphf)
{
chm_data_t *data = (chm_data_t *)mphf->data;
free(data->g);
hash_state_destroy(data->hashes[0]);
hash_state_destroy(data->hashes[1]);
free(data->hashes);
free(data);
free(mphf);
}
/** \fn void chm_pack(cmph_t *mphf, void *packed_mphf);
* \brief Packs a perfect hash function into a preallocated contiguous memory area pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void chm_pack(cmph_t *mphf, void *packed_mphf)
{
chm_data_t *data = (chm_data_t *)mphf->data;
cmph_uint8 * ptr = packed_mphf;
CMPH_HASH h2_type;
// packing h1 type
CMPH_HASH h1_type = hash_get_type(data->hashes[0]);
*((cmph_uint32 *) ptr) = h1_type;
ptr += sizeof(cmph_uint32);
// packing h1
hash_state_pack(data->hashes[0], ptr);
ptr += hash_state_packed_size(h1_type);
// packing h2 type
h2_type = hash_get_type(data->hashes[1]);
*((cmph_uint32 *) ptr) = h2_type;
ptr += sizeof(cmph_uint32);
// packing h2
hash_state_pack(data->hashes[1], ptr);
ptr += hash_state_packed_size(h2_type);
// packing n
*((cmph_uint32 *) ptr) = data->n;
ptr += sizeof(data->n);
// packing m
*((cmph_uint32 *) ptr) = data->m;
ptr += sizeof(data->m);
// packing g
memcpy(ptr, data->g, sizeof(cmph_uint32)*data->n);
}
/** \fn cmph_uint32 chm_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 chm_packed_size(cmph_t *mphf)
{
chm_data_t *data = (chm_data_t *)mphf->data;
CMPH_HASH h1_type = hash_get_type(data->hashes[0]);
CMPH_HASH h2_type = hash_get_type(data->hashes[1]);
return (cmph_uint32)(sizeof(CMPH_ALGO) + hash_state_packed_size(h1_type) + hash_state_packed_size(h2_type) +
4*sizeof(cmph_uint32) + sizeof(cmph_uint32)*data->n);
}
/** \fn cmph_uint32 chm_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 chm_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
register cmph_uint8 *h1_ptr = packed_mphf;
register CMPH_HASH h1_type = *((cmph_uint32 *)h1_ptr);
register cmph_uint8 *h2_ptr;
register CMPH_HASH h2_type;
register cmph_uint32 *g_ptr;
register cmph_uint32 n, m, h1, h2;
h1_ptr += 4;
h2_ptr = h1_ptr + hash_state_packed_size(h1_type);
h2_type = *((cmph_uint32 *)h2_ptr);
h2_ptr += 4;
g_ptr = (cmph_uint32 *)(h2_ptr + hash_state_packed_size(h2_type));
n = *g_ptr++;
m = *g_ptr++;
h1 = hash_packed(h1_ptr, h1_type, key, keylen) % n;
h2 = hash_packed(h2_ptr, h2_type, key, keylen) % n;
DEBUGP("key: %s h1: %u h2: %u\n", key, h1, h2);
if (h1 == h2 && ++h2 >= n) h2 = 0;
DEBUGP("key: %s g[h1]: %u g[h2]: %u edges: %u\n", key, g_ptr[h1], g_ptr[h2], m);
return (g_ptr[h1] + g_ptr[h2]) % m;
}
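
Note on the assignment step in chm_new() and chm_traverse() above: each key is an edge between two vertices of an acyclic graph, and g is filled so that the sum of the g values at an edge's endpoints gives the edge id, which chm_search() then reduces modulo m. The block below is a minimal sketch of that assignment on a fixed edge list, without any hashing. The names toy_edge_t, toy_traverse, toy_assign and toy_lookup are hypothetical, and the linear scan over the edge list replaces the graph_t neighbor iterator purely for brevity.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
	uint32_t u, v; /* endpoints; the edge id is the array index */
} toy_edge_t;

/* Depth-first assignment: for every unvisited neighbor w of v reached
 * through edge e, choose g[w] so that g[v] + g[w] equals e. */
static void toy_traverse(const toy_edge_t *edges, uint32_t nedges,
                         uint32_t *g, uint8_t *visited, uint32_t v)
{
	uint32_t e;
	visited[v] = 1;
	for (e = 0; e < nedges; ++e)
	{
		uint32_t w;
		if (edges[e].u == v) w = edges[e].v;
		else if (edges[e].v == v) w = edges[e].u;
		else continue;
		if (visited[w]) continue;
		g[w] = e - g[v]; /* unsigned wrap-around is intended */
		toy_traverse(edges, nedges, g, visited, w);
	}
}

/* Assigns g for every connected component, rooting each at g = 0. */
static void toy_assign(const toy_edge_t *edges, uint32_t nedges,
                       uint32_t *g, uint8_t *visited, uint32_t nvertices)
{
	uint32_t v;
	assert(nedges > 0);
	for (v = 0; v < nvertices; ++v) visited[v] = 0;
	for (v = 0; v < nvertices; ++v)
	{
		if (!visited[v])
		{
			g[v] = 0;
			toy_traverse(edges, nedges, g, visited, v);
		}
	}
}

/* Same arithmetic as the return statement of chm_search(). */
static uint32_t toy_lookup(const uint32_t *g, uint32_t m, uint32_t h1, uint32_t h2)
{
	return (g[h1] + g[h2]) % m;
}
```

The subtraction may wrap around in 32-bit unsigned arithmetic, and the addition in the lookup wraps back by the same amount, so the sum still equals the edge id. The sketch assumes an acyclic edge list, which is the property chm_gen_edges() checks before the assignment step runs.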

42
girepository/cmph/chm.h Normal file
@ -0,0 +1,42 @@
#ifndef __CMPH_CHM_H__
#define __CMPH_CHM_H__
#include "cmph.h"
typedef struct __chm_data_t chm_data_t;
typedef struct __chm_config_data_t chm_config_data_t;
chm_config_data_t *chm_config_new(void);
void chm_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void chm_config_destroy(cmph_config_t *mph);
cmph_t *chm_new(cmph_config_t *mph, double c);
void chm_load(FILE *f, cmph_t *mphf);
int chm_dump(cmph_t *mphf, FILE *f);
void chm_destroy(cmph_t *mphf);
cmph_uint32 chm_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void chm_pack(cmph_t *mphf, void *packed_mphf);
* \brief Packs a perfect hash function into a preallocated contiguous memory area pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void chm_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 chm_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 chm_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 chm_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 chm_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif

@ -0,0 +1,24 @@
#ifndef __CMPH_CHM_STRUCTS_H__
#define __CMPH_CHM_STRUCTS_H__
#include "hash_state.h"
struct __chm_data_t
{
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
cmph_uint32 *g;
hash_state_t **hashes;
};
struct __chm_config_data_t
{
CMPH_HASH hashfuncs[2];
cmph_uint32 m; //edges (words) count
cmph_uint32 n; //vertex count
graph_t *graph;
cmph_uint32 *g;
hash_state_t **hashes;
};
#endif

844
girepository/cmph/cmph.c Normal file
@ -0,0 +1,844 @@
#include "cmph.h"
#include "cmph_structs.h"
#include "chm.h"
#include "bmz.h"
#include "bmz8.h"
#include "brz.h"
#include "fch.h"
#include "bdz.h"
#include "bdz_ph.h"
#include "chd_ph.h"
#include "chd.h"
#include <stdlib.h>
#include <assert.h>
#include <string.h>
//#define DEBUG
#include "debug.h"
const char *cmph_names[] = {"bmz", "bmz8", "chm", "brz", "fch", "bdz", "bdz_ph", "chd_ph", "chd", NULL };
typedef struct
{
void *vector;
cmph_uint32 position; // access position when data is a vector
} cmph_vector_t;
/**
* Supports a vector of structs as the source of keys.
*
* E.g. the keys could be the fieldB members in a vector of struct rec, where
* struct rec is defined as:
* struct rec {
* fieldA;
* fieldB;
* fieldC;
* }
*/
typedef struct
{
void *vector; /* Pointer to the vector of struct */
cmph_uint32 position; /* current position */
cmph_uint32 struct_size; /* The size of the struct */
cmph_uint32 key_offset; /* The byte offset of the key in the struct */
cmph_uint32 key_len; /* The length of the key */
} cmph_struct_vector_t;
static cmph_io_adapter_t *cmph_io_vector_new(void * vector, cmph_uint32 nkeys);
static void cmph_io_vector_destroy(cmph_io_adapter_t * key_source);
static cmph_io_adapter_t *cmph_io_struct_vector_new(void * vector, cmph_uint32 struct_size, cmph_uint32 key_offset, cmph_uint32 key_len, cmph_uint32 nkeys);
static void cmph_io_struct_vector_destroy(cmph_io_adapter_t * key_source);
static int key_nlfile_read(void *data, char **key, cmph_uint32 *keylen)
{
FILE *fd = (FILE *)data;
*key = NULL;
*keylen = 0;
while(1)
{
char buf[BUFSIZ];
char *c = fgets(buf, BUFSIZ, fd);
if (c == NULL) return -1;
if (feof(fd)) return -1;
*key = (char *)realloc(*key, *keylen + strlen(buf) + 1);
memcpy(*key + *keylen, buf, strlen(buf));
*keylen += (cmph_uint32)strlen(buf);
if (buf[strlen(buf) - 1] != '\n') continue;
break;
}
if ((*keylen) && (*key)[*keylen - 1] == '\n')
{
(*key)[(*keylen) - 1] = 0;
--(*keylen);
}
return (int)(*keylen);
}
static int key_byte_vector_read(void *data, char **key, cmph_uint32 *keylen)
{
cmph_vector_t *cmph_vector = (cmph_vector_t *)data;
cmph_uint8 **keys_vd = (cmph_uint8 **)cmph_vector->vector;
size_t size;
memcpy(keylen, keys_vd[cmph_vector->position], sizeof(*keylen));
size = *keylen;
*key = (char *)malloc(size);
memcpy(*key, keys_vd[cmph_vector->position] + sizeof(*keylen), size);
cmph_vector->position = cmph_vector->position + 1;
return (int)(*keylen);
}
static int key_struct_vector_read(void *data, char **key, cmph_uint32 *keylen)
{
cmph_struct_vector_t *cmph_struct_vector = (cmph_struct_vector_t *)data;
char *keys_vd = (char *)cmph_struct_vector->vector;
size_t size;
*keylen = cmph_struct_vector->key_len;
size = *keylen;
*key = (char *)malloc(size);
memcpy(*key, (keys_vd + (cmph_struct_vector->position * cmph_struct_vector->struct_size) + cmph_struct_vector->key_offset), size);
cmph_struct_vector->position = cmph_struct_vector->position + 1;
return (int)(*keylen);
}
static int key_vector_read(void *data, char **key, cmph_uint32 *keylen)
{
cmph_vector_t *cmph_vector = (cmph_vector_t *)data;
char **keys_vd = (char **)cmph_vector->vector;
size_t size;
*keylen = (cmph_uint32)strlen(keys_vd[cmph_vector->position]);
size = *keylen;
*key = (char *)malloc(size + 1);
strcpy(*key, keys_vd[cmph_vector->position]);
cmph_vector->position = cmph_vector->position + 1;
return (int)(*keylen);
}
static void key_nlfile_dispose(void *data, char *key, cmph_uint32 keylen)
{
free(key);
}
static void key_vector_dispose(void *data, char *key, cmph_uint32 keylen)
{
free(key);
}
static void key_nlfile_rewind(void *data)
{
FILE *fd = (FILE *)data;
rewind(fd);
}
static void key_struct_vector_rewind(void *data)
{
cmph_struct_vector_t *cmph_struct_vector = (cmph_struct_vector_t *)data;
cmph_struct_vector->position = 0;
}
static void key_vector_rewind(void *data)
{
cmph_vector_t *cmph_vector = (cmph_vector_t *)data;
cmph_vector->position = 0;
}
static cmph_uint32 count_nlfile_keys(FILE *fd)
{
cmph_uint32 count = 0;
rewind(fd);
while(1)
{
char buf[BUFSIZ];
if (fgets(buf, BUFSIZ, fd) == NULL) break;
if (feof(fd)) break;
if (buf[strlen(buf) - 1] != '\n') continue;
++count;
}
rewind(fd);
return count;
}
cmph_io_adapter_t *cmph_io_nlfile_adapter(FILE * keys_fd)
{
cmph_io_adapter_t * key_source = (cmph_io_adapter_t *)malloc(sizeof(cmph_io_adapter_t));
assert(key_source);
key_source->data = (void *)keys_fd;
key_source->nkeys = count_nlfile_keys(keys_fd);
key_source->read = key_nlfile_read;
key_source->dispose = key_nlfile_dispose;
key_source->rewind = key_nlfile_rewind;
return key_source;
}
void cmph_io_nlfile_adapter_destroy(cmph_io_adapter_t * key_source)
{
free(key_source);
}
cmph_io_adapter_t *cmph_io_nlnkfile_adapter(FILE * keys_fd, cmph_uint32 nkeys)
{
cmph_io_adapter_t * key_source = (cmph_io_adapter_t *)malloc(sizeof(cmph_io_adapter_t));
assert(key_source);
key_source->data = (void *)keys_fd;
key_source->nkeys = nkeys;
key_source->read = key_nlfile_read;
key_source->dispose = key_nlfile_dispose;
key_source->rewind = key_nlfile_rewind;
return key_source;
}
void cmph_io_nlnkfile_adapter_destroy(cmph_io_adapter_t * key_source)
{
free(key_source);
}
static cmph_io_adapter_t *cmph_io_struct_vector_new(void * vector, cmph_uint32 struct_size, cmph_uint32 key_offset, cmph_uint32 key_len, cmph_uint32 nkeys)
{
cmph_io_adapter_t * key_source = (cmph_io_adapter_t *)malloc(sizeof(cmph_io_adapter_t));
cmph_struct_vector_t * cmph_struct_vector = (cmph_struct_vector_t *)malloc(sizeof(cmph_struct_vector_t));
assert(key_source);
assert(cmph_struct_vector);
cmph_struct_vector->vector = vector;
cmph_struct_vector->position = 0;
cmph_struct_vector->struct_size = struct_size;
cmph_struct_vector->key_offset = key_offset;
cmph_struct_vector->key_len = key_len;
key_source->data = (void *)cmph_struct_vector;
key_source->nkeys = nkeys;
return key_source;
}
static void cmph_io_struct_vector_destroy(cmph_io_adapter_t * key_source)
{
cmph_struct_vector_t *cmph_struct_vector = (cmph_struct_vector_t *)key_source->data;
cmph_struct_vector->vector = NULL;
free(cmph_struct_vector);
free(key_source);
}
static cmph_io_adapter_t *cmph_io_vector_new(void * vector, cmph_uint32 nkeys)
{
cmph_io_adapter_t * key_source = (cmph_io_adapter_t *)malloc(sizeof(cmph_io_adapter_t));
cmph_vector_t * cmph_vector = (cmph_vector_t *)malloc(sizeof(cmph_vector_t));
assert(key_source);
assert(cmph_vector);
cmph_vector->vector = vector;
cmph_vector->position = 0;
key_source->data = (void *)cmph_vector;
key_source->nkeys = nkeys;
return key_source;
}
static void cmph_io_vector_destroy(cmph_io_adapter_t * key_source)
{
cmph_vector_t *cmph_vector = (cmph_vector_t *)key_source->data;
cmph_vector->vector = NULL;
free(cmph_vector);
free(key_source);
}
cmph_io_adapter_t *cmph_io_byte_vector_adapter(cmph_uint8 ** vector, cmph_uint32 nkeys)
{
cmph_io_adapter_t * key_source = cmph_io_vector_new(vector, nkeys);
key_source->read = key_byte_vector_read;
key_source->dispose = key_vector_dispose;
key_source->rewind = key_vector_rewind;
return key_source;
}
void cmph_io_byte_vector_adapter_destroy(cmph_io_adapter_t * key_source)
{
cmph_io_vector_destroy(key_source);
}
cmph_io_adapter_t *cmph_io_struct_vector_adapter(void * vector, cmph_uint32 struct_size, cmph_uint32 key_offset, cmph_uint32 key_len, cmph_uint32 nkeys)
{
cmph_io_adapter_t * key_source = cmph_io_struct_vector_new(vector, struct_size, key_offset, key_len, nkeys);
key_source->read = key_struct_vector_read;
key_source->dispose = key_vector_dispose;
key_source->rewind = key_struct_vector_rewind;
return key_source;
}
void cmph_io_struct_vector_adapter_destroy(cmph_io_adapter_t * key_source)
{
cmph_io_struct_vector_destroy(key_source);
}
cmph_io_adapter_t *cmph_io_vector_adapter(char ** vector, cmph_uint32 nkeys)
{
cmph_io_adapter_t * key_source = cmph_io_vector_new(vector, nkeys);
key_source->read = key_vector_read;
key_source->dispose = key_vector_dispose;
key_source->rewind = key_vector_rewind;
return key_source;
}
void cmph_io_vector_adapter_destroy(cmph_io_adapter_t * key_source)
{
cmph_io_vector_destroy(key_source);
}
cmph_config_t *cmph_config_new(cmph_io_adapter_t *key_source)
{
cmph_config_t *mph = NULL;
mph = __config_new(key_source);
assert(mph);
mph->algo = CMPH_CHM; // default value
mph->data = chm_config_new();
return mph;
}
void cmph_config_set_algo(cmph_config_t *mph, CMPH_ALGO algo)
{
if (algo != mph->algo)
{
switch (mph->algo)
{
case CMPH_CHM:
chm_config_destroy(mph);
break;
case CMPH_BMZ:
bmz_config_destroy(mph);
break;
case CMPH_BMZ8:
bmz8_config_destroy(mph);
break;
case CMPH_BRZ:
brz_config_destroy(mph);
break;
case CMPH_FCH:
fch_config_destroy(mph);
break;
case CMPH_BDZ:
bdz_config_destroy(mph);
break;
case CMPH_BDZ_PH:
bdz_ph_config_destroy(mph);
break;
case CMPH_CHD_PH:
chd_ph_config_destroy(mph);
break;
case CMPH_CHD:
chd_config_destroy(mph);
break;
default:
assert(0);
}
switch(algo)
{
case CMPH_CHM:
mph->data = chm_config_new();
break;
case CMPH_BMZ:
mph->data = bmz_config_new();
break;
case CMPH_BMZ8:
mph->data = bmz8_config_new();
break;
case CMPH_BRZ:
mph->data = brz_config_new();
break;
case CMPH_FCH:
mph->data = fch_config_new();
break;
case CMPH_BDZ:
mph->data = bdz_config_new();
break;
case CMPH_BDZ_PH:
mph->data = bdz_ph_config_new();
break;
case CMPH_CHD_PH:
mph->data = chd_ph_config_new();
break;
case CMPH_CHD:
mph->data = chd_config_new(mph);
break;
default:
assert(0);
}
}
mph->algo = algo;
}
void cmph_config_set_tmp_dir(cmph_config_t *mph, cmph_uint8 *tmp_dir)
{
if (mph->algo == CMPH_BRZ)
{
brz_config_set_tmp_dir(mph, tmp_dir);
}
}
void cmph_config_set_mphf_fd(cmph_config_t *mph, FILE *mphf_fd)
{
if (mph->algo == CMPH_BRZ)
{
brz_config_set_mphf_fd(mph, mphf_fd);
}
}
void cmph_config_set_b(cmph_config_t *mph, cmph_uint32 b)
{
if (mph->algo == CMPH_BRZ)
{
brz_config_set_b(mph, b);
}
else if (mph->algo == CMPH_BDZ)
{
bdz_config_set_b(mph, b);
}
else if (mph->algo == CMPH_CHD_PH)
{
chd_ph_config_set_b(mph, b);
}
else if (mph->algo == CMPH_CHD)
{
chd_config_set_b(mph, b);
}
}
void cmph_config_set_keys_per_bin(cmph_config_t *mph, cmph_uint32 keys_per_bin)
{
if (mph->algo == CMPH_CHD_PH)
{
chd_ph_config_set_keys_per_bin(mph, keys_per_bin);
}
else if (mph->algo == CMPH_CHD)
{
chd_config_set_keys_per_bin(mph, keys_per_bin);
}
}
void cmph_config_set_memory_availability(cmph_config_t *mph, cmph_uint32 memory_availability)
{
if (mph->algo == CMPH_BRZ)
{
brz_config_set_memory_availability(mph, memory_availability);
}
}
void cmph_config_destroy(cmph_config_t *mph)
{
if(mph)
{
DEBUGP("Destroying mph with algo %s\n", cmph_names[mph->algo]);
switch (mph->algo)
{
case CMPH_CHM:
chm_config_destroy(mph);
break;
case CMPH_BMZ: /* included -- Fabiano */
bmz_config_destroy(mph);
break;
case CMPH_BMZ8: /* included -- Fabiano */
bmz8_config_destroy(mph);
break;
case CMPH_BRZ: /* included -- Fabiano */
brz_config_destroy(mph);
break;
case CMPH_FCH: /* included -- Fabiano */
fch_config_destroy(mph);
break;
case CMPH_BDZ: /* included -- Fabiano */
bdz_config_destroy(mph);
break;
case CMPH_BDZ_PH: /* included -- Fabiano */
bdz_ph_config_destroy(mph);
break;
case CMPH_CHD_PH: /* included -- Fabiano */
chd_ph_config_destroy(mph);
break;
case CMPH_CHD: /* included -- Fabiano */
chd_config_destroy(mph);
break;
default:
assert(0);
}
__config_destroy(mph);
}
}
void cmph_config_set_verbosity(cmph_config_t *mph, cmph_uint32 verbosity)
{
mph->verbosity = verbosity;
}
void cmph_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
switch (mph->algo)
{
case CMPH_CHM:
chm_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_BMZ: /* included -- Fabiano */
bmz_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_BMZ8: /* included -- Fabiano */
bmz8_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_BRZ: /* included -- Fabiano */
brz_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_FCH: /* included -- Fabiano */
fch_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_BDZ: /* included -- Fabiano */
bdz_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_BDZ_PH: /* included -- Fabiano */
bdz_ph_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_CHD_PH: /* included -- Fabiano */
chd_ph_config_set_hashfuncs(mph, hashfuncs);
break;
case CMPH_CHD: /* included -- Fabiano */
chd_config_set_hashfuncs(mph, hashfuncs);
break;
default:
break;
}
return;
}
void cmph_config_set_graphsize(cmph_config_t *mph, double c)
{
mph->c = c;
return;
}
cmph_t *cmph_new(cmph_config_t *mph)
{
cmph_t *mphf = NULL;
double c = mph->c;
DEBUGP("Creating mph with algorithm %s\n", cmph_names[mph->algo]);
switch (mph->algo)
{
case CMPH_CHM:
DEBUGP("Creating chm hash\n");
mphf = chm_new(mph, c);
break;
case CMPH_BMZ: /* included -- Fabiano */
DEBUGP("Creating bmz hash\n");
mphf = bmz_new(mph, c);
break;
case CMPH_BMZ8: /* included -- Fabiano */
DEBUGP("Creating bmz8 hash\n");
mphf = bmz8_new(mph, c);
break;
case CMPH_BRZ: /* included -- Fabiano */
DEBUGP("Creating brz hash\n");
if (c >= 2.0) brz_config_set_algo(mph, CMPH_FCH);
else brz_config_set_algo(mph, CMPH_BMZ8);
mphf = brz_new(mph, c);
break;
case CMPH_FCH: /* included -- Fabiano */
DEBUGP("Creating fch hash\n");
mphf = fch_new(mph, c);
break;
case CMPH_BDZ: /* included -- Fabiano */
DEBUGP("Creating bdz hash\n");
mphf = bdz_new(mph, c);
break;
case CMPH_BDZ_PH: /* included -- Fabiano */
DEBUGP("Creating bdz_ph hash\n");
mphf = bdz_ph_new(mph, c);
break;
case CMPH_CHD_PH: /* included -- Fabiano */
DEBUGP("Creating chd_ph hash\n");
mphf = chd_ph_new(mph, c);
break;
case CMPH_CHD: /* included -- Fabiano */
DEBUGP("Creating chd hash\n");
mphf = chd_new(mph, c);
break;
default:
assert(0);
}
return mphf;
}
int cmph_dump(cmph_t *mphf, FILE *f)
{
switch (mphf->algo)
{
case CMPH_CHM:
return chm_dump(mphf, f);
case CMPH_BMZ: /* included -- Fabiano */
return bmz_dump(mphf, f);
case CMPH_BMZ8: /* included -- Fabiano */
return bmz8_dump(mphf, f);
case CMPH_BRZ: /* included -- Fabiano */
return brz_dump(mphf, f);
case CMPH_FCH: /* included -- Fabiano */
return fch_dump(mphf, f);
case CMPH_BDZ: /* included -- Fabiano */
return bdz_dump(mphf, f);
case CMPH_BDZ_PH: /* included -- Fabiano */
return bdz_ph_dump(mphf, f);
case CMPH_CHD_PH: /* included -- Fabiano */
return chd_ph_dump(mphf, f);
case CMPH_CHD: /* included -- Fabiano */
return chd_dump(mphf, f);
default:
assert(0);
}
assert(0);
return 0;
}
cmph_t *cmph_load(FILE *f)
{
cmph_t *mphf = NULL;
DEBUGP("Loading mphf generic parts\n");
mphf = __cmph_load(f);
if (mphf == NULL) return NULL;
DEBUGP("Loading mphf algorithm dependent parts\n");
switch (mphf->algo)
{
case CMPH_CHM:
chm_load(f, mphf);
break;
case CMPH_BMZ: /* included -- Fabiano */
DEBUGP("Loading bmz algorithm dependent parts\n");
bmz_load(f, mphf);
break;
case CMPH_BMZ8: /* included -- Fabiano */
DEBUGP("Loading bmz8 algorithm dependent parts\n");
bmz8_load(f, mphf);
break;
case CMPH_BRZ: /* included -- Fabiano */
DEBUGP("Loading brz algorithm dependent parts\n");
brz_load(f, mphf);
break;
case CMPH_FCH: /* included -- Fabiano */
DEBUGP("Loading fch algorithm dependent parts\n");
fch_load(f, mphf);
break;
case CMPH_BDZ: /* included -- Fabiano */
DEBUGP("Loading bdz algorithm dependent parts\n");
bdz_load(f, mphf);
break;
case CMPH_BDZ_PH: /* included -- Fabiano */
DEBUGP("Loading bdz_ph algorithm dependent parts\n");
bdz_ph_load(f, mphf);
break;
case CMPH_CHD_PH: /* included -- Fabiano */
DEBUGP("Loading chd_ph algorithm dependent parts\n");
chd_ph_load(f, mphf);
break;
case CMPH_CHD: /* included -- Fabiano */
DEBUGP("Loading chd algorithm dependent parts\n");
chd_load(f, mphf);
break;
default:
assert(0);
}
DEBUGP("Loaded mphf\n");
return mphf;
}
cmph_uint32 cmph_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
DEBUGP("mphf algorithm: %u \n", mphf->algo);
switch(mphf->algo)
{
case CMPH_CHM:
return chm_search(mphf, key, keylen);
case CMPH_BMZ: /* included -- Fabiano */
DEBUGP("bmz algorithm search\n");
return bmz_search(mphf, key, keylen);
case CMPH_BMZ8: /* included -- Fabiano */
DEBUGP("bmz8 algorithm search\n");
return bmz8_search(mphf, key, keylen);
case CMPH_BRZ: /* included -- Fabiano */
DEBUGP("brz algorithm search\n");
return brz_search(mphf, key, keylen);
case CMPH_FCH: /* included -- Fabiano */
DEBUGP("fch algorithm search\n");
return fch_search(mphf, key, keylen);
case CMPH_BDZ: /* included -- Fabiano */
DEBUGP("bdz algorithm search\n");
return bdz_search(mphf, key, keylen);
case CMPH_BDZ_PH: /* included -- Fabiano */
DEBUGP("bdz_ph algorithm search\n");
return bdz_ph_search(mphf, key, keylen);
case CMPH_CHD_PH: /* included -- Fabiano */
DEBUGP("chd_ph algorithm search\n");
return chd_ph_search(mphf, key, keylen);
case CMPH_CHD: /* included -- Fabiano */
DEBUGP("chd algorithm search\n");
return chd_search(mphf, key, keylen);
default:
assert(0);
}
assert(0);
return 0;
}
cmph_uint32 cmph_size(cmph_t *mphf)
{
return mphf->size;
}
void cmph_destroy(cmph_t *mphf)
{
switch(mphf->algo)
{
case CMPH_CHM:
chm_destroy(mphf);
return;
case CMPH_BMZ: /* included -- Fabiano */
bmz_destroy(mphf);
return;
case CMPH_BMZ8: /* included -- Fabiano */
bmz8_destroy(mphf);
return;
case CMPH_BRZ: /* included -- Fabiano */
brz_destroy(mphf);
return;
case CMPH_FCH: /* included -- Fabiano */
fch_destroy(mphf);
return;
case CMPH_BDZ: /* included -- Fabiano */
bdz_destroy(mphf);
return;
case CMPH_BDZ_PH: /* included -- Fabiano */
bdz_ph_destroy(mphf);
return;
case CMPH_CHD_PH: /* included -- Fabiano */
chd_ph_destroy(mphf);
return;
case CMPH_CHD: /* included -- Fabiano */
chd_destroy(mphf);
return;
default:
assert(0);
}
assert(0);
return;
}
/** \fn void cmph_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void cmph_pack(cmph_t *mphf, void *packed_mphf)
{
// packing algorithm type to be used in cmph.c
cmph_uint32 * ptr = (cmph_uint32 *) packed_mphf;
*ptr++ = mphf->algo;
DEBUGP("mphf->algo = %u\n", mphf->algo);
switch(mphf->algo)
{
case CMPH_CHM:
chm_pack(mphf, ptr);
break;
case CMPH_BMZ: /* included -- Fabiano */
bmz_pack(mphf, ptr);
break;
case CMPH_BMZ8: /* included -- Fabiano */
bmz8_pack(mphf, ptr);
break;
case CMPH_BRZ: /* included -- Fabiano */
brz_pack(mphf, ptr);
break;
case CMPH_FCH: /* included -- Fabiano */
fch_pack(mphf, ptr);
break;
case CMPH_BDZ: /* included -- Fabiano */
bdz_pack(mphf, ptr);
break;
case CMPH_BDZ_PH: /* included -- Fabiano */
bdz_ph_pack(mphf, ptr);
break;
case CMPH_CHD_PH: /* included -- Fabiano */
chd_ph_pack(mphf, ptr);
break;
case CMPH_CHD: /* included -- Fabiano */
chd_pack(mphf, ptr);
break;
default:
assert(0);
}
return;
}
/** \fn cmph_uint32 cmph_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 cmph_packed_size(cmph_t *mphf)
{
switch(mphf->algo)
{
case CMPH_CHM:
return chm_packed_size(mphf);
case CMPH_BMZ: /* included -- Fabiano */
return bmz_packed_size(mphf);
case CMPH_BMZ8: /* included -- Fabiano */
return bmz8_packed_size(mphf);
case CMPH_BRZ: /* included -- Fabiano */
return brz_packed_size(mphf);
case CMPH_FCH: /* included -- Fabiano */
return fch_packed_size(mphf);
case CMPH_BDZ: /* included -- Fabiano */
return bdz_packed_size(mphf);
case CMPH_BDZ_PH: /* included -- Fabiano */
return bdz_ph_packed_size(mphf);
case CMPH_CHD_PH: /* included -- Fabiano */
return chd_ph_packed_size(mphf);
case CMPH_CHD: /* included -- Fabiano */
return chd_packed_size(mphf);
default:
assert(0);
}
return 0; // FAILURE
}
/** \fn cmph_uint32 cmph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 cmph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
cmph_uint32 *ptr = (cmph_uint32 *)packed_mphf;
// fprintf(stderr, "algo:%u\n", *ptr);
switch(*ptr)
{
case CMPH_CHM:
return chm_search_packed(++ptr, key, keylen);
case CMPH_BMZ: /* included -- Fabiano */
return bmz_search_packed(++ptr, key, keylen);
case CMPH_BMZ8: /* included -- Fabiano */
return bmz8_search_packed(++ptr, key, keylen);
case CMPH_BRZ: /* included -- Fabiano */
return brz_search_packed(++ptr, key, keylen);
case CMPH_FCH: /* included -- Fabiano */
return fch_search_packed(++ptr, key, keylen);
case CMPH_BDZ: /* included -- Fabiano */
return bdz_search_packed(++ptr, key, keylen);
case CMPH_BDZ_PH: /* included -- Fabiano */
return bdz_ph_search_packed(++ptr, key, keylen);
case CMPH_CHD_PH: /* included -- Fabiano */
return chd_ph_search_packed(++ptr, key, keylen);
case CMPH_CHD: /* included -- Fabiano */
return chd_search_packed(++ptr, key, keylen);
default:
assert(0);
}
return 0; // FAILURE
}
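
Note on the struct-vector adapter in cmph.c above: key_struct_vector_read copies a fixed-length key out of record number `position` by adding `position * struct_size + key_offset` to the base pointer. The following standalone sketch shows only that address arithmetic. `struct rec`, its fields, and `copy_key` are hypothetical names chosen for illustration and are not part of cmph.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; only name_key is used as the hash key. */
struct rec {
    int id;
    char name_key[8];
    double weight;
};

/* Copy the fixed-length key of record `position` out of a contiguous array
 * of records. Returns a malloc()ed buffer of key_len bytes (not
 * NUL-terminated) that the caller must free(), or NULL if malloc() fails. */
static char *copy_key(const void *vector, size_t position, size_t struct_size,
                      size_t key_offset, size_t key_len)
{
    const char *base = (const char *)vector;
    char *key;

    /* The key must lie entirely inside one record. */
    assert(key_offset + key_len <= struct_size);

    key = malloc(key_len);
    if (key == NULL) return NULL;
    memcpy(key, base + position * struct_size + key_offset, key_len);
    return key;
}
```

A caller would typically pass `sizeof(struct rec)` and `offsetof(struct rec, name_key)` for the size and offset, which mirrors the `struct_size` and `key_offset` parameters of cmph_io_struct_vector_adapter.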

112
girepository/cmph/cmph.h Normal file

@ -0,0 +1,112 @@
#ifndef __CMPH_H__
#define __CMPH_H__
#include <stdlib.h>
#include <stdio.h>
#ifdef __cplusplus
extern "C"
{
#endif
#include "cmph_types.h"
typedef struct __config_t cmph_config_t;
typedef struct __cmph_t cmph_t;
typedef struct
{
void *data;
cmph_uint32 nkeys;
int (*read)(void *, char **, cmph_uint32 *);
void (*dispose)(void *, char *, cmph_uint32);
void (*rewind)(void *);
} cmph_io_adapter_t;
/** Adapter pattern API **/
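/* Release each adapter with its matching cmph_io_*_adapter_destroy() function; the vector adapters own extra state that a bare free() would leak. */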
cmph_io_adapter_t *cmph_io_nlfile_adapter(FILE * keys_fd);
void cmph_io_nlfile_adapter_destroy(cmph_io_adapter_t * key_source);
cmph_io_adapter_t *cmph_io_nlnkfile_adapter(FILE * keys_fd, cmph_uint32 nkeys);
void cmph_io_nlnkfile_adapter_destroy(cmph_io_adapter_t * key_source);
cmph_io_adapter_t *cmph_io_vector_adapter(char ** vector, cmph_uint32 nkeys);
void cmph_io_vector_adapter_destroy(cmph_io_adapter_t * key_source);
cmph_io_adapter_t *cmph_io_byte_vector_adapter(cmph_uint8 ** vector, cmph_uint32 nkeys);
void cmph_io_byte_vector_adapter_destroy(cmph_io_adapter_t * key_source);
cmph_io_adapter_t *cmph_io_struct_vector_adapter(void * vector,
cmph_uint32 struct_size,
cmph_uint32 key_offset,
cmph_uint32 key_len,
cmph_uint32 nkeys);
void cmph_io_struct_vector_adapter_destroy(cmph_io_adapter_t * key_source);
/** Hash configuration API **/
cmph_config_t *cmph_config_new(cmph_io_adapter_t *key_source);
void cmph_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void cmph_config_set_verbosity(cmph_config_t *mph, cmph_uint32 verbosity);
void cmph_config_set_graphsize(cmph_config_t *mph, double c);
void cmph_config_set_algo(cmph_config_t *mph, CMPH_ALGO algo);
void cmph_config_set_tmp_dir(cmph_config_t *mph, cmph_uint8 *tmp_dir);
void cmph_config_set_mphf_fd(cmph_config_t *mph, FILE *mphf_fd);
void cmph_config_set_b(cmph_config_t *mph, cmph_uint32 b);
void cmph_config_set_keys_per_bin(cmph_config_t *mph, cmph_uint32 keys_per_bin);
void cmph_config_set_memory_availability(cmph_config_t *mph, cmph_uint32 memory_availability);
void cmph_config_destroy(cmph_config_t *mph);
/** Hash API **/
cmph_t *cmph_new(cmph_config_t *mph);
/** \fn cmph_uint32 cmph_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
* \brief Computes the mphf value.
* \param mphf pointer to the resulting function
* \param key is the key to be hashed
* \param keylen is the key length in bytes
* \return The mphf value
*/
cmph_uint32 cmph_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
cmph_uint32 cmph_size(cmph_t *mphf);
void cmph_destroy(cmph_t *mphf);
/** Hash serialization/deserialization */
int cmph_dump(cmph_t *mphf, FILE *f);
cmph_t *cmph_load(FILE *f);
/** \fn void cmph_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed to by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the
* resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void cmph_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 cmph_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 cmph_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 cmph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 cmph_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
// Timing functions. To use them, the macro CMPH_TIMING must be defined.
#include "cmph_time.h"
#ifdef __cplusplus
}
#endif
#endif
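
Note on the packed API declared in cmph.h above: cmph_pack writes the algorithm id into the first 32-bit word of the buffer and hands the remainder to the algorithm-specific packer, and cmph_search_packed reads that word back to choose which search routine to call. The sketch below illustrates the same tag-then-payload layout with two toy "algorithms". All names (`toy_algo`, `toy_pack`, `toy_search_packed`, `toy_packed_size`) are hypothetical, and the toy payload is a single parameter word rather than a real hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical algorithm ids standing in for CMPH_ALGO values. */
enum toy_algo { TOY_ADD = 0, TOY_XOR = 1 };

/* Bytes needed for a packed toy function: one tag word plus one parameter. */
static size_t toy_packed_size(void)
{
    return 2 * sizeof(uint32_t);
}

/* Write the tag first, then the algorithm-specific payload. */
static void toy_pack(enum toy_algo algo, uint32_t param, void *packed)
{
    uint32_t words[2];
    words[0] = (uint32_t)algo;
    words[1] = param;
    memcpy(packed, words, sizeof words);
}

/* Read the tag back and dispatch on it; the payload starts one word in. */
static uint32_t toy_search_packed(const void *packed, uint32_t key)
{
    uint32_t words[2];
    memcpy(words, packed, sizeof words);
    switch (words[0]) {
    case TOY_ADD: return key + words[1];
    case TOY_XOR: return key ^ words[1];
    default: assert(0); return 0;
    }
}
```

The caller allocates `toy_packed_size()` bytes before packing, in the same way that cmph_pack expects a buffer of at least cmph_packed_size() bytes. The sketch copies through memcpy so it makes no alignment assumption about the buffer.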


@ -0,0 +1,76 @@
#include "cmph_structs.h"
#include <string.h>
#include <errno.h>
//#define DEBUG
#include "debug.h"
cmph_config_t *__config_new(cmph_io_adapter_t *key_source)
{
cmph_config_t *mph = (cmph_config_t *)malloc(sizeof(cmph_config_t));
/* Check the allocation before touching the memory. */
if (mph == NULL) return NULL;
memset(mph, 0, sizeof(cmph_config_t));
mph->key_source = key_source;
mph->verbosity = 0;
mph->data = NULL;
mph->c = 0;
return mph;
}
void __config_destroy(cmph_config_t *mph)
{
free(mph);
}
void __cmph_dump(cmph_t *mphf, FILE *fd)
{
register size_t nbytes;
nbytes = fwrite(cmph_names[mphf->algo], (size_t)(strlen(cmph_names[mphf->algo]) + 1), (size_t)1, fd);
nbytes = fwrite(&(mphf->size), sizeof(mphf->size), (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
}
}
cmph_t *__cmph_load(FILE *f)
{
cmph_t *mphf = NULL;
cmph_uint32 i;
char algo_name[BUFSIZ];
char *ptr = algo_name;
CMPH_ALGO algo = CMPH_COUNT;
register size_t nbytes;
DEBUGP("Loading mphf\n");
while(1)
{
size_t c = fread(ptr, (size_t)1, (size_t)1, f);
if (c != 1) return NULL;
if (*ptr == 0) break;
++ptr;
/* Reject names that would overflow algo_name. */
if (ptr == algo_name + BUFSIZ) return NULL;
}
for(i = 0; i < CMPH_COUNT; ++i)
{
if (strcmp(algo_name, cmph_names[i]) == 0)
{
algo = i;
}
}
if (algo == CMPH_COUNT)
{
DEBUGP("Algorithm %s not found\n", algo_name);
return NULL;
}
mphf = (cmph_t *)malloc(sizeof(cmph_t));
if (mphf == NULL) return NULL;
mphf->algo = algo;
nbytes = fread(&(mphf->size), sizeof(mphf->size), (size_t)1, f);
mphf->data = NULL;
DEBUGP("Algorithm is %s and mphf is sized %u\n", cmph_names[algo], mphf->size);
if (nbytes == 0 && ferror(f)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
}
return mphf;
}
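
Note on __cmph_load above: the loader reads a NUL-terminated algorithm name from the stream one byte at a time and then looks it up in the NULL-terminated cmph_names table. A minimal standalone sketch of those two steps, with an explicit capacity check on the name buffer, could look like this. `read_cstring` and `find_name` are hypothetical helpers written for this note, not functions from cmph.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read a NUL-terminated string from f into buf, whose capacity cap includes
 * the terminator. Returns 0 on success, or -1 on EOF, read error, or when no
 * terminator is found within cap bytes. */
static int read_cstring(FILE *f, char *buf, size_t cap)
{
    size_t i;

    assert(f != NULL && buf != NULL);
    for (i = 0; i < cap; ++i) {
        int c = fgetc(f);
        if (c == EOF) return -1;
        buf[i] = (char)c;
        if (c == 0) return 0;
    }
    return -1;
}

/* Return the index of name in a NULL-terminated table, or -1 if absent. */
static int find_name(const char *const *names, const char *name)
{
    int i;

    for (i = 0; names[i] != NULL; ++i) {
        if (strcmp(names[i], name) == 0) return i;
    }
    return -1;
}
```

Because the cmph_names array and the CMPH_ALGO enum list the algorithms in the same order, the index found this way can be used directly as the algorithm id, which is what the loop in __cmph_load relies on.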


@ -0,0 +1,33 @@
#ifndef __CMPH_STRUCTS_H__
#define __CMPH_STRUCTS_H__
#include "cmph.h"
/** Hash generation algorithm data
*/
struct __config_t
{
CMPH_ALGO algo;
cmph_io_adapter_t *key_source;
cmph_uint32 verbosity;
double c;
void *data; // algorithm dependent data
};
/** Hash querying algorithm data
*/
struct __cmph_t
{
CMPH_ALGO algo;
cmph_uint32 size;
cmph_io_adapter_t *key_source;
void *data; // algorithm dependent data
};
cmph_config_t *__config_new(cmph_io_adapter_t *key_source);
void __config_destroy(cmph_config_t*);
void __cmph_dump(cmph_t *mphf, FILE *);
cmph_t *__cmph_load(FILE *f);
#endif


@ -0,0 +1,62 @@
#ifdef ELAPSED_TIME_IN_SECONDS
#undef ELAPSED_TIME_IN_SECONDS
#endif
#ifdef ELAPSED_TIME_IN_uSECONDS
#undef ELAPSED_TIME_IN_uSECONDS
#endif
#ifdef __GNUC__
#include <sys/time.h>
#ifndef WIN32
#include <sys/resource.h>
#endif
#endif
#ifdef __GNUC__
#ifndef __CMPH_TIME_H__
#define __CMPH_TIME_H__
static inline void elapsed_time_in_seconds(double * elapsed_time)
{
struct timeval e_time;
if (gettimeofday(&e_time, NULL) < 0) {
return;
}
*elapsed_time = (double)e_time.tv_sec + ((double)e_time.tv_usec/1000000.0);
}
static inline void dummy_elapsed_time_in_seconds(double * elapsed_time)
{
(void) elapsed_time;
}
static inline void elapsed_time_in_useconds(cmph_uint64 * elapsed_time)
{
struct timeval e_time;
if (gettimeofday(&e_time, NULL) < 0) {
return;
}
*elapsed_time = (cmph_uint64)e_time.tv_sec*1000000 + (cmph_uint64)e_time.tv_usec;
}
static inline void dummy_elapsed_time_in_useconds(cmph_uint64 * elapsed_time)
{
(void) elapsed_time;
}
#endif
#endif
#ifdef CMPH_TIMING
#ifdef __GNUC__
#define ELAPSED_TIME_IN_SECONDS elapsed_time_in_seconds
#define ELAPSED_TIME_IN_uSECONDS elapsed_time_in_useconds
#else
#define ELAPSED_TIME_IN_SECONDS dummy_elapsed_time_in_seconds
#define ELAPSED_TIME_IN_uSECONDS dummy_elapsed_time_in_useconds
#endif
#else
#ifdef __GNUC__
#define ELAPSED_TIME_IN_SECONDS
#define ELAPSED_TIME_IN_uSECONDS
#else
#define ELAPSED_TIME_IN_SECONDS dummy_elapsed_time_in_seconds
#define ELAPSED_TIME_IN_uSECONDS dummy_elapsed_time_in_useconds
#endif
#endif
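
Note on the timing helpers above: elapsed_time_in_seconds converts a `struct timeval` from gettimeofday() into fractional seconds and silently leaves its output untouched on failure. The sketch below performs the same conversion but reports failure through a return value, which is a different design from the header and is shown only for comparison. `wall_clock_seconds` is a hypothetical name, and the code assumes a POSIX system that provides <sys/time.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Store the current wall-clock time, in seconds, in *out. Returns 0 on
 * success and -1 (leaving *out untouched) if gettimeofday() fails. */
static int wall_clock_seconds(double *out)
{
    struct timeval tv;

    assert(out != NULL);
    if (gettimeofday(&tv, NULL) < 0) return -1;
    *out = (double)tv.tv_sec + (double)tv.tv_usec / 1000000.0;
    return 0;
}
```

An elapsed time is then the difference between two such readings taken before and after the work being measured. gettimeofday() reports wall-clock time, so the difference can be distorted if the system clock is adjusted in between.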


@ -0,0 +1,25 @@
#include <glib.h>
#ifndef __CMPH_TYPES_H__
#define __CMPH_TYPES_H__
typedef gint8 cmph_int8;
typedef guint8 cmph_uint8;
typedef gint16 cmph_int16;
typedef guint16 cmph_uint16;
typedef gint32 cmph_int32;
typedef guint32 cmph_uint32;
typedef gint64 cmph_int64;
typedef guint64 cmph_uint64;
typedef enum { CMPH_HASH_JENKINS, CMPH_HASH_COUNT } CMPH_HASH;
extern const char *cmph_hash_names[];
typedef enum { CMPH_BMZ, CMPH_BMZ8, CMPH_CHM, CMPH_BRZ, CMPH_FCH,
CMPH_BDZ, CMPH_BDZ_PH,
CMPH_CHD_PH, CMPH_CHD, CMPH_COUNT } CMPH_ALGO;
extern const char *cmph_names[];
#endif


@ -0,0 +1,327 @@
#include<stdlib.h>
#include<stdio.h>
#include<limits.h>
#include<string.h>
#include"compressed_rank.h"
#include"bitbool.h"
// #define DEBUG
#include"debug.h"
static inline cmph_uint32 compressed_rank_i_log2(cmph_uint32 x)
{
register cmph_uint32 res = 0;
while(x > 1)
{
x >>= 1;
res++;
}
return res;
}
void compressed_rank_init(compressed_rank_t * cr)
{
cr->max_val = 0;
cr->n = 0;
cr->rem_r = 0;
select_init(&cr->sel);
cr->vals_rems = 0;
}
void compressed_rank_destroy(compressed_rank_t * cr)
{
free(cr->vals_rems);
cr->vals_rems = 0;
select_destroy(&cr->sel);
}
void compressed_rank_generate(compressed_rank_t * cr, cmph_uint32 * vals_table, cmph_uint32 n)
{
register cmph_uint32 i,j;
register cmph_uint32 rems_mask;
register cmph_uint32 * select_vec = 0;
cr->n = n;
cr->max_val = vals_table[cr->n - 1];
cr->rem_r = compressed_rank_i_log2(cr->max_val/cr->n);
if(cr->rem_r == 0)
{
cr->rem_r = 1;
}
select_vec = (cmph_uint32 *) calloc(cr->max_val >> cr->rem_r, sizeof(cmph_uint32));
cr->vals_rems = (cmph_uint32 *) calloc(BITS_TABLE_SIZE(cr->n, cr->rem_r), sizeof(cmph_uint32));
rems_mask = (1U << cr->rem_r) - 1U;
for(i = 0; i < cr->n; i++)
{
set_bits_value(cr->vals_rems, i, vals_table[i] & rems_mask, cr->rem_r, rems_mask);
}
for(i = 1, j = 0; i <= cr->max_val >> cr->rem_r; i++)
{
while(i > (vals_table[j] >> cr->rem_r))
{
j++;
}
select_vec[i - 1] = j;
}
// FABIANO: before it was (cr->total_length >> cr->rem_r) + 1. But I wiped out the + 1 because
// I changed the select structure to work up to m, instead of up to m - 1.
select_generate(&cr->sel, select_vec, cr->max_val >> cr->rem_r, cr->n);
free(select_vec);
}
cmph_uint32 compressed_rank_query(compressed_rank_t * cr, cmph_uint32 idx)
{
register cmph_uint32 rems_mask;
register cmph_uint32 val_quot, val_rem;
register cmph_uint32 sel_res, rank;
if(idx > cr->max_val)
{
return cr->n;
}
val_quot = idx >> cr->rem_r;
rems_mask = (1U << cr->rem_r) - 1U;
val_rem = idx & rems_mask;
if(val_quot == 0)
{
rank = sel_res = 0;
}
else
{
sel_res = select_query(&cr->sel, val_quot - 1) + 1;
rank = sel_res - val_quot;
}
do
{
if(GETBIT32(cr->sel.bits_vec, sel_res))
{
break;
}
if(get_bits_value(cr->vals_rems, rank, cr->rem_r, rems_mask) >= val_rem)
{
break;
}
sel_res++;
rank++;
} while(1);
return rank;
}
cmph_uint32 compressed_rank_get_space_usage(compressed_rank_t * cr)
{
register cmph_uint32 space_usage = select_get_space_usage(&cr->sel);
space_usage += BITS_TABLE_SIZE(cr->n, cr->rem_r)*(cmph_uint32)sizeof(cmph_uint32)*8;
space_usage += 3*(cmph_uint32)sizeof(cmph_uint32)*8;
return space_usage;
}
void compressed_rank_dump(compressed_rank_t * cr, char **buf, cmph_uint32 *buflen)
{
register cmph_uint32 sel_size = select_packed_size(&(cr->sel));
register cmph_uint32 vals_rems_size = BITS_TABLE_SIZE(cr->n, cr->rem_r) * (cmph_uint32)sizeof(cmph_uint32);
register cmph_uint32 pos = 0;
char * buf_sel = 0;
cmph_uint32 buflen_sel = 0;
#ifdef DEBUG
cmph_uint32 i;
#endif
*buflen = 4*(cmph_uint32)sizeof(cmph_uint32) + sel_size + vals_rems_size;
DEBUGP("sel_size = %u\n", sel_size);
DEBUGP("vals_rems_size = %u\n", vals_rems_size);
*buf = (char *)calloc(*buflen, sizeof(char));
if (!*buf)
{
*buflen = UINT_MAX;
return;
}
// dumping max_val, n and rem_r
memcpy(*buf, &(cr->max_val), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("max_val = %u\n", cr->max_val);
memcpy(*buf + pos, &(cr->n), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("n = %u\n", cr->n);
memcpy(*buf + pos, &(cr->rem_r), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("rem_r = %u\n", cr->rem_r);
// dumping sel
select_dump(&cr->sel, &buf_sel, &buflen_sel);
memcpy(*buf + pos, &buflen_sel, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("buflen_sel = %u\n", buflen_sel);
memcpy(*buf + pos, buf_sel, buflen_sel);
#ifdef DEBUG
i = 0;
for(i = 0; i < buflen_sel; i++)
{
DEBUGP("pos = %u -- buf_sel[%u] = %u\n", pos, i, *(*buf + pos + i));
}
#endif
pos += buflen_sel;
free(buf_sel);
// dumping vals_rems
memcpy(*buf + pos, cr->vals_rems, vals_rems_size);
#ifdef DEBUG
for(i = 0; i < vals_rems_size; i++)
{
DEBUGP("pos = %u -- vals_rems_size = %u -- vals_rems[%u] = %u\n", pos, vals_rems_size, i, *(*buf + pos + i));
}
#endif
pos += vals_rems_size;
DEBUGP("Dumped compressed rank structure with size %u bytes\n", *buflen);
}
void compressed_rank_load(compressed_rank_t * cr, const char *buf, cmph_uint32 buflen)
{
register cmph_uint32 pos = 0;
cmph_uint32 buflen_sel = 0;
register cmph_uint32 vals_rems_size = 0;
#ifdef DEBUG
cmph_uint32 i;
#endif
// loading max_val, n, and rem_r
memcpy(&(cr->max_val), buf, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("max_val = %u\n", cr->max_val);
memcpy(&(cr->n), buf + pos, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("n = %u\n", cr->n);
memcpy(&(cr->rem_r), buf + pos, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("rem_r = %u\n", cr->rem_r);
// loading sel
memcpy(&buflen_sel, buf + pos, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("buflen_sel = %u\n", buflen_sel);
select_load(&cr->sel, buf + pos, buflen_sel);
#ifdef DEBUG
i = 0;
for(i = 0; i < buflen_sel; i++)
{
DEBUGP("pos = %u -- buf_sel[%u] = %u\n", pos, i, *(buf + pos + i));
}
#endif
pos += buflen_sel;
// loading vals_rems
if(cr->vals_rems)
{
free(cr->vals_rems);
}
vals_rems_size = BITS_TABLE_SIZE(cr->n, cr->rem_r);
cr->vals_rems = (cmph_uint32 *) calloc(vals_rems_size, sizeof(cmph_uint32));
vals_rems_size *= 4;
memcpy(cr->vals_rems, buf + pos, vals_rems_size);
#ifdef DEBUG
for(i = 0; i < vals_rems_size; i++)
{
DEBUGP("pos = %u -- vals_rems_size = %u -- vals_rems[%u] = %u\n", pos, vals_rems_size, i, *(buf + pos + i));
}
#endif
pos += vals_rems_size;
DEBUGP("Loaded compressed rank structure with size %u bytes\n", buflen);
}
void compressed_rank_pack(compressed_rank_t *cr, void *cr_packed)
{
if (cr && cr_packed)
{
char *buf = NULL;
cmph_uint32 buflen = 0;
compressed_rank_dump(cr, &buf, &buflen);
if (!buf) return; // dump failed to allocate; buflen is UINT_MAX in that case
memcpy(cr_packed, buf, buflen);
free(buf);
}
}
cmph_uint32 compressed_rank_packed_size(compressed_rank_t *cr)
{
register cmph_uint32 sel_size = select_packed_size(&cr->sel);
register cmph_uint32 vals_rems_size = BITS_TABLE_SIZE(cr->n, cr->rem_r) * (cmph_uint32)sizeof(cmph_uint32);
return 4 * (cmph_uint32)sizeof(cmph_uint32) + sel_size + vals_rems_size;
}
cmph_uint32 compressed_rank_query_packed(void * cr_packed, cmph_uint32 idx)
{
// unpacking cr_packed
register cmph_uint32 *ptr = (cmph_uint32 *)cr_packed;
register cmph_uint32 max_val = *ptr++;
register cmph_uint32 n = *ptr++;
register cmph_uint32 rem_r = *ptr++;
register cmph_uint32 buflen_sel = *ptr++;
register cmph_uint32 * sel_packed = ptr;
register cmph_uint32 * bits_vec = sel_packed + 2; // skipping n and m
register cmph_uint32 * vals_rems = (ptr += (buflen_sel >> 2));
// compressed rank query computation
register cmph_uint32 rems_mask;
register cmph_uint32 val_quot, val_rem;
register cmph_uint32 sel_res, rank;
if(idx > max_val)
{
return n;
}
val_quot = idx >> rem_r;
rems_mask = (1U << rem_r) - 1U;
val_rem = idx & rems_mask;
if(val_quot == 0)
{
rank = sel_res = 0;
}
else
{
sel_res = select_query_packed(sel_packed, val_quot - 1) + 1;
rank = sel_res - val_quot;
}
do
{
if(GETBIT32(bits_vec, sel_res))
{
break;
}
if(get_bits_value(vals_rems, rank, rem_r, rems_mask) >= val_rem)
{
break;
}
sel_res++;
rank++;
} while(1);
return rank;
}
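
The packed query above first splits idx into a quotient (the high bits, resolved through the select structure) and a remainder (the low rem_r bits, compared against vals_rems). A minimal standalone sketch of only that split follows; the names sketch_split_t and sketch_split_index are hypothetical and not part of cmph, and the sketch assumes rem_r < 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of cmph. */
typedef struct {
    uint32_t quot; /* idx >> rem_r: the high bits */
    uint32_t rem;  /* idx & ((1 << rem_r) - 1): the low rem_r bits */
} sketch_split_t;

/* Split idx the same way the query does. Assumes rem_r < 32 so the
 * shift that builds the mask is well defined. */
sketch_split_t sketch_split_index(uint32_t idx, uint32_t rem_r)
{
    sketch_split_t s;
    uint32_t rems_mask = (1U << rem_r) - 1U;
    s.quot = idx >> rem_r;
    s.rem = idx & rems_mask;
    return s;
}
```

With idx = 13 and rem_r = 2, the quotient is 3 and the remainder is 1, and (quot << rem_r) | rem gives back 13.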


@ -0,0 +1,55 @@
#ifndef __CMPH_COMPRESSED_RANK_H__
#define __CMPH_COMPRESSED_RANK_H__
#include "select.h"
struct _compressed_rank_t
{
cmph_uint32 max_val;
cmph_uint32 n; // number of values stored in vals_rems
// The length in bits of each value is decomposed into two components: the lg(n) MSBs are stored in the rank/select data structure;
// the remaining LSBs are stored in a table of n cells, each one of rem_r bits.
cmph_uint32 rem_r;
select_t sel;
cmph_uint32 * vals_rems;
};
typedef struct _compressed_rank_t compressed_rank_t;
void compressed_rank_init(compressed_rank_t * cr);
void compressed_rank_destroy(compressed_rank_t * cr);
void compressed_rank_generate(compressed_rank_t * cr, cmph_uint32 * vals_table, cmph_uint32 n);
cmph_uint32 compressed_rank_query(compressed_rank_t * cr, cmph_uint32 idx);
cmph_uint32 compressed_rank_get_space_usage(compressed_rank_t * cr);
void compressed_rank_dump(compressed_rank_t * cr, char **buf, cmph_uint32 *buflen);
void compressed_rank_load(compressed_rank_t * cr, const char *buf, cmph_uint32 buflen);
/** \fn void compressed_rank_pack(compressed_rank_t *cr, void *cr_packed);
* \brief Pack a compressed_rank structure into a preallocated contiguous memory space pointed to by cr_packed.
* \param cr points to the compressed_rank structure
* \param cr_packed pointer to the contiguous memory area used to store the compressed_rank structure. The size of cr_packed must be at least @see compressed_rank_packed_size
*/
void compressed_rank_pack(compressed_rank_t *cr, void *cr_packed);
/** \fn cmph_uint32 compressed_rank_packed_size(compressed_rank_t *cr);
* \brief Return the amount of space needed to pack a compressed_rank structure.
* \return the size of the packed compressed_rank structure or zero for failures
*/
cmph_uint32 compressed_rank_packed_size(compressed_rank_t *cr);
/** \fn cmph_uint32 compressed_rank_query_packed(void * cr_packed, cmph_uint32 idx);
* \param cr_packed is a pointer to a contiguous memory area
* \param idx is an index to compute the rank
* \return an integer that represents the compressed_rank value.
*/
cmph_uint32 compressed_rank_query_packed(void * cr_packed, cmph_uint32 idx);
#endif


@ -0,0 +1,384 @@
#include "compressed_seq.h"
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <limits.h>
#include <string.h>
#include "bitbool.h"
// #define DEBUG
#include "debug.h"
static inline cmph_uint32 compressed_seq_i_log2(cmph_uint32 x)
{
register cmph_uint32 res = 0;
while(x > 1)
{
x >>= 1;
res++;
}
return res;
}
void compressed_seq_init(compressed_seq_t * cs)
{
select_init(&cs->sel);
cs->n = 0;
cs->rem_r = 0;
cs->length_rems = 0;
cs->total_length = 0;
cs->store_table = 0;
}
void compressed_seq_destroy(compressed_seq_t * cs)
{
free(cs->store_table);
cs->store_table = 0;
free(cs->length_rems);
cs->length_rems = 0;
select_destroy(&cs->sel);
}
void compressed_seq_generate(compressed_seq_t * cs, cmph_uint32 * vals_table, cmph_uint32 n)
{
register cmph_uint32 i;
// lengths: represents lengths of encoded values
register cmph_uint32 * lengths = (cmph_uint32 *)calloc(n, sizeof(cmph_uint32));
register cmph_uint32 rems_mask;
register cmph_uint32 stored_value;
cs->n = n;
cs->total_length = 0;
for(i = 0; i < cs->n; i++)
{
if(vals_table[i] == 0)
{
lengths[i] = 0;
}
else
{
lengths[i] = compressed_seq_i_log2(vals_table[i] + 1);
cs->total_length += lengths[i];
};
};
if(cs->store_table)
{
free(cs->store_table);
}
cs->store_table = (cmph_uint32 *) calloc(((cs->total_length + 31) >> 5), sizeof(cmph_uint32));
cs->total_length = 0;
for(i = 0; i < cs->n; i++)
{
if(vals_table[i] == 0)
continue;
stored_value = vals_table[i] - ((1U << lengths[i]) - 1U);
set_bits_at_pos(cs->store_table, cs->total_length, stored_value, lengths[i]);
cs->total_length += lengths[i];
};
cs->rem_r = compressed_seq_i_log2(cs->total_length/cs->n);
if(cs->rem_r == 0)
{
cs->rem_r = 1;
}
if(cs->length_rems)
{
free(cs->length_rems);
}
cs->length_rems = (cmph_uint32 *) calloc(BITS_TABLE_SIZE(cs->n, cs->rem_r), sizeof(cmph_uint32));
rems_mask = (1U << cs->rem_r) - 1U;
cs->total_length = 0;
for(i = 0; i < cs->n; i++)
{
cs->total_length += lengths[i];
set_bits_value(cs->length_rems, i, cs->total_length & rems_mask, cs->rem_r, rems_mask);
lengths[i] = cs->total_length >> cs->rem_r;
};
select_init(&cs->sel);
// FABIANO: before it was (cs->total_length >> cs->rem_r) + 1. But I wiped out the + 1 because
// I changed the select structure to work up to m, instead of up to m - 1.
select_generate(&cs->sel, lengths, cs->n, (cs->total_length >> cs->rem_r));
free(lengths);
}
cmph_uint32 compressed_seq_get_space_usage(compressed_seq_t * cs)
{
register cmph_uint32 space_usage = select_get_space_usage(&cs->sel);
space_usage += ((cs->total_length + 31) >> 5) * (cmph_uint32)sizeof(cmph_uint32) * 8;
space_usage += BITS_TABLE_SIZE(cs->n, cs->rem_r) * (cmph_uint32)sizeof(cmph_uint32) * 8;
return 4 * (cmph_uint32)sizeof(cmph_uint32) * 8 + space_usage;
}
cmph_uint32 compressed_seq_query(compressed_seq_t * cs, cmph_uint32 idx)
{
register cmph_uint32 enc_idx, enc_length;
register cmph_uint32 rems_mask;
register cmph_uint32 stored_value;
register cmph_uint32 sel_res;
assert(idx < cs->n); // FABIANO ADDED
rems_mask = (1U << cs->rem_r) - 1U;
if(idx == 0)
{
enc_idx = 0;
sel_res = select_query(&cs->sel, idx);
}
else
{
sel_res = select_query(&cs->sel, idx - 1);
enc_idx = (sel_res - (idx - 1)) << cs->rem_r;
enc_idx += get_bits_value(cs->length_rems, idx-1, cs->rem_r, rems_mask);
sel_res = select_next_query(&cs->sel, sel_res);
};
enc_length = (sel_res - idx) << cs->rem_r;
enc_length += get_bits_value(cs->length_rems, idx, cs->rem_r, rems_mask);
enc_length -= enc_idx;
if(enc_length == 0)
return 0;
stored_value = get_bits_at_pos(cs->store_table, enc_idx, enc_length);
return stored_value + ((1U << enc_length) - 1U);
}
void compressed_seq_dump(compressed_seq_t * cs, char ** buf, cmph_uint32 * buflen)
{
register cmph_uint32 sel_size = select_packed_size(&(cs->sel));
register cmph_uint32 length_rems_size = BITS_TABLE_SIZE(cs->n, cs->rem_r) * 4;
register cmph_uint32 store_table_size = ((cs->total_length + 31) >> 5) * 4;
register cmph_uint32 pos = 0;
char * buf_sel = 0;
cmph_uint32 buflen_sel = 0;
#ifdef DEBUG
cmph_uint32 i;
#endif
*buflen = 4*(cmph_uint32)sizeof(cmph_uint32) + sel_size + length_rems_size + store_table_size;
DEBUGP("sel_size = %u\n", sel_size);
DEBUGP("length_rems_size = %u\n", length_rems_size);
DEBUGP("store_table_size = %u\n", store_table_size);
*buf = (char *)calloc(*buflen, sizeof(char));
if (!*buf)
{
*buflen = UINT_MAX;
return;
}
// dumping n, rem_r and total_length
memcpy(*buf, &(cs->n), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("n = %u\n", cs->n);
memcpy(*buf + pos, &(cs->rem_r), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("rem_r = %u\n", cs->rem_r);
memcpy(*buf + pos, &(cs->total_length), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("total_length = %u\n", cs->total_length);
// dumping sel
select_dump(&cs->sel, &buf_sel, &buflen_sel);
memcpy(*buf + pos, &buflen_sel, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("buflen_sel = %u\n", buflen_sel);
memcpy(*buf + pos, buf_sel, buflen_sel);
#ifdef DEBUG
i = 0;
for(i = 0; i < buflen_sel; i++)
{
DEBUGP("pos = %u -- buf_sel[%u] = %u\n", pos, i, *(*buf + pos + i));
}
#endif
pos += buflen_sel;
free(buf_sel);
// dumping length_rems
memcpy(*buf + pos, cs->length_rems, length_rems_size);
#ifdef DEBUG
for(i = 0; i < length_rems_size; i++)
{
DEBUGP("pos = %u -- length_rems_size = %u -- length_rems[%u] = %u\n", pos, length_rems_size, i, *(*buf + pos + i));
}
#endif
pos += length_rems_size;
// dumping store_table
memcpy(*buf + pos, cs->store_table, store_table_size);
#ifdef DEBUG
for(i = 0; i < store_table_size; i++)
{
DEBUGP("pos = %u -- store_table_size = %u -- store_table[%u] = %u\n", pos, store_table_size, i, *(*buf + pos + i));
}
#endif
DEBUGP("Dumped compressed sequence structure with size %u bytes\n", *buflen);
}
void compressed_seq_load(compressed_seq_t * cs, const char * buf, cmph_uint32 buflen)
{
register cmph_uint32 pos = 0;
cmph_uint32 buflen_sel = 0;
register cmph_uint32 length_rems_size = 0;
register cmph_uint32 store_table_size = 0;
#ifdef DEBUG
cmph_uint32 i;
#endif
// loading n, rem_r and total_length
memcpy(&(cs->n), buf, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("n = %u\n", cs->n);
memcpy(&(cs->rem_r), buf + pos, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("rem_r = %u\n", cs->rem_r);
memcpy(&(cs->total_length), buf + pos, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("total_length = %u\n", cs->total_length);
// loading sel
memcpy(&buflen_sel, buf + pos, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
DEBUGP("buflen_sel = %u\n", buflen_sel);
select_load(&cs->sel, buf + pos, buflen_sel);
#ifdef DEBUG
i = 0;
for(i = 0; i < buflen_sel; i++)
{
DEBUGP("pos = %u -- buf_sel[%u] = %u\n", pos, i, *(buf + pos + i));
}
#endif
pos += buflen_sel;
// loading length_rems
if(cs->length_rems)
{
free(cs->length_rems);
}
length_rems_size = BITS_TABLE_SIZE(cs->n, cs->rem_r);
cs->length_rems = (cmph_uint32 *) calloc(length_rems_size, sizeof(cmph_uint32));
length_rems_size *= 4;
memcpy(cs->length_rems, buf + pos, length_rems_size);
#ifdef DEBUG
for(i = 0; i < length_rems_size; i++)
{
DEBUGP("pos = %u -- length_rems_size = %u -- length_rems[%u] = %u\n", pos, length_rems_size, i, *(buf + pos + i));
}
#endif
pos += length_rems_size;
// loading store_table
store_table_size = ((cs->total_length + 31) >> 5);
if(cs->store_table)
{
free(cs->store_table);
}
cs->store_table = (cmph_uint32 *) calloc(store_table_size, sizeof(cmph_uint32));
store_table_size *= 4;
memcpy(cs->store_table, buf + pos, store_table_size);
#ifdef DEBUG
for(i = 0; i < store_table_size; i++)
{
DEBUGP("pos = %u -- store_table_size = %u -- store_table[%u] = %u\n", pos, store_table_size, i, *(buf + pos + i));
}
#endif
DEBUGP("Loaded compressed sequence structure with size %u bytes\n", buflen);
}
void compressed_seq_pack(compressed_seq_t *cs, void *cs_packed)
{
if (cs && cs_packed)
{
char *buf = NULL;
cmph_uint32 buflen = 0;
compressed_seq_dump(cs, &buf, &buflen);
if (!buf) return; // dump failed to allocate; buflen is UINT_MAX in that case
memcpy(cs_packed, buf, buflen);
free(buf);
}
}
cmph_uint32 compressed_seq_packed_size(compressed_seq_t *cs)
{
register cmph_uint32 sel_size = select_packed_size(&cs->sel);
register cmph_uint32 store_table_size = ((cs->total_length + 31) >> 5) * (cmph_uint32)sizeof(cmph_uint32);
register cmph_uint32 length_rems_size = BITS_TABLE_SIZE(cs->n, cs->rem_r) * (cmph_uint32)sizeof(cmph_uint32);
return 4 * (cmph_uint32)sizeof(cmph_uint32) + sel_size + store_table_size + length_rems_size;
}
cmph_uint32 compressed_seq_query_packed(void * cs_packed, cmph_uint32 idx)
{
// unpacking cs_packed
register cmph_uint32 *ptr = (cmph_uint32 *)cs_packed;
register cmph_uint32 n = *ptr++;
register cmph_uint32 rem_r = *ptr++;
register cmph_uint32 buflen_sel, length_rems_size, enc_idx, enc_length;
// compressed sequence query computation
register cmph_uint32 rems_mask, stored_value, sel_res;
register cmph_uint32 *sel_packed, *length_rems, *store_table;
ptr++; // skipping total_length
// register cmph_uint32 total_length = *ptr++;
buflen_sel = *ptr++;
sel_packed = ptr;
length_rems = (ptr += (buflen_sel >> 2));
length_rems_size = BITS_TABLE_SIZE(n, rem_r);
store_table = (ptr += length_rems_size);
rems_mask = (1U << rem_r) - 1U;
if(idx == 0)
{
enc_idx = 0;
sel_res = select_query_packed(sel_packed, idx);
}
else
{
sel_res = select_query_packed(sel_packed, idx - 1);
enc_idx = (sel_res - (idx - 1)) << rem_r;
enc_idx += get_bits_value(length_rems, idx-1, rem_r, rems_mask);
sel_res = select_next_query_packed(sel_packed, sel_res);
};
enc_length = (sel_res - idx) << rem_r;
enc_length += get_bits_value(length_rems, idx, rem_r, rems_mask);
enc_length -= enc_idx;
if(enc_length == 0)
return 0;
stored_value = get_bits_at_pos(store_table, enc_idx, enc_length);
return stored_value + ((1U << enc_length) - 1U);
}
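
The generate and query routines above rely on a variable-length code: a nonzero value v is stored in floor(log2(v + 1)) bits as its offset from the smallest value of that length, and zero takes no bits at all. A minimal sketch of that code in isolation follows; the sketch_ names are hypothetical and not part of cmph, and the sketch assumes v < UINT32_MAX so that v + 1 does not wrap.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(x)) for x >= 1; returns 0 for x == 0 as well. */
uint32_t sketch_ilog2(uint32_t x)
{
    uint32_t res = 0;
    while (x > 1) {
        x >>= 1;
        res++;
    }
    return res;
}

/* Number of bits used to store v; zero is stored in zero bits. */
uint32_t sketch_enc_length(uint32_t v)
{
    return v == 0 ? 0 : sketch_ilog2(v + 1);
}

/* Offset of v from the smallest value that has the same length. */
uint32_t sketch_enc_stored(uint32_t v)
{
    uint32_t len = sketch_enc_length(v);
    return v - ((1U << len) - 1U);
}

/* Inverse of the two functions above: rebuild v from (stored, len). */
uint32_t sketch_decode(uint32_t stored, uint32_t len)
{
    return stored + ((1U << len) - 1U);
}
```

For example, v = 6 has length 2 and stored offset 3, and decoding (3, 2) gives 6 again. Values 1 and 2 share length 1, values 3 to 6 share length 2, and so on.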


@ -0,0 +1,84 @@
#ifndef __CMPH_COMPRESSED_SEQ_H__
#define __CMPH_COMPRESSED_SEQ_H__
#include"select.h"
struct _compressed_seq_t
{
cmph_uint32 n; // number of values stored in store_table
// The length in bits of each value is decomposed into two components: the lg(n) MSBs are stored in the rank/select data structure;
// the remaining LSBs are stored in a table of n cells, each one of rem_r bits.
cmph_uint32 rem_r;
cmph_uint32 total_length; // total length in bits of store_table
select_t sel;
cmph_uint32 * length_rems;
cmph_uint32 * store_table;
};
typedef struct _compressed_seq_t compressed_seq_t;
/** \fn void compressed_seq_init(compressed_seq_t * cs);
* \brief Initialize a compressed sequence structure.
* \param cs points to the compressed sequence structure to be initialized
*/
void compressed_seq_init(compressed_seq_t * cs);
/** \fn void compressed_seq_destroy(compressed_seq_t * cs);
* \brief Destroy a compressed sequence given as input.
* \param cs points to the compressed sequence structure to be destroyed
*/
void compressed_seq_destroy(compressed_seq_t * cs);
/** \fn void compressed_seq_generate(compressed_seq_t * cs, cmph_uint32 * vals_table, cmph_uint32 n);
* \brief Generate a compressed sequence from an input array with n values.
* \param cs points to the compressed sequence structure
* \param vals_table pointer to the array given as input
* \param n number of values in @see vals_table
*/
void compressed_seq_generate(compressed_seq_t * cs, cmph_uint32 * vals_table, cmph_uint32 n);
/** \fn cmph_uint32 compressed_seq_query(compressed_seq_t * cs, cmph_uint32 idx);
* \brief Returns the value stored at index @see idx of the compressed sequence structure.
* \param cs points to the compressed sequence structure
* \param idx index to retrieve the value from
* \return the value stored at index @see idx of the compressed sequence structure
*/
cmph_uint32 compressed_seq_query(compressed_seq_t * cs, cmph_uint32 idx);
/** \fn cmph_uint32 compressed_seq_get_space_usage(compressed_seq_t * cs);
* \brief Returns amount of space (in bits) to store the compressed sequence.
* \param cs points to the compressed sequence structure
* \return the amount of space (in bits) to store @see cs
*/
cmph_uint32 compressed_seq_get_space_usage(compressed_seq_t * cs);
void compressed_seq_dump(compressed_seq_t * cs, char ** buf, cmph_uint32 * buflen);
void compressed_seq_load(compressed_seq_t * cs, const char * buf, cmph_uint32 buflen);
/** \fn void compressed_seq_pack(compressed_seq_t *cs, void *cs_packed);
* \brief Pack a compressed sequence structure into a preallocated contiguous memory space pointed to by cs_packed.
* \param cs points to the compressed sequence structure
* \param cs_packed pointer to the contiguous memory area used to store the compressed sequence structure. The size of cs_packed must be at least @see compressed_seq_packed_size
*/
void compressed_seq_pack(compressed_seq_t *cs, void *cs_packed);
/** \fn cmph_uint32 compressed_seq_packed_size(compressed_seq_t *cs);
* \brief Return the amount of space needed to pack a compressed sequence structure.
* \return the size of the packed compressed sequence structure or zero for failures
*/
cmph_uint32 compressed_seq_packed_size(compressed_seq_t *cs);
/** \fn cmph_uint32 compressed_seq_query_packed(void * cs_packed, cmph_uint32 idx);
* \brief Returns the value stored at index @see idx of the packed compressed sequence structure.
* \param cs_packed is a pointer to a contiguous memory area
* \param idx is the index to retrieve the value from
* \return the value stored at index @see idx of the packed compressed sequence structure
*/
cmph_uint32 compressed_seq_query_packed(void * cs_packed, cmph_uint32 idx);
#endif

53
girepository/cmph/debug.h Normal file

@ -0,0 +1,53 @@
#ifdef DEBUGP
#undef DEBUGP
#endif
#ifdef __cplusplus
#include <cstdio>
#ifdef WIN32
#include <cstring>
#endif
#else
#include <stdio.h>
#ifdef WIN32
#include <string.h>
#endif
#endif
#ifndef __GNUC__
#ifndef __DEBUG_H__
#define __DEBUG_H__
#include <stdarg.h>
static void debugprintf(const char *format, ...)
{
// Forward the caller's format and arguments unchanged. Prefixing "%s:%d "
// here would make vfprintf read two arguments the caller never passed.
va_list ap;
va_start(ap, format);
vfprintf(stderr, format, ap);
va_end(ap);
}
static void dummyprintf(const char *format, ...)
{}
#endif
#endif
#ifdef DEBUG
#ifndef __GNUC__
#define DEBUGP debugprintf
#else
#define DEBUGP(args...) do { fprintf(stderr, "%s:%d ", __FILE__, __LINE__); fprintf(stderr, ## args); } while(0)
#endif
#else
#ifndef __GNUC__
#define DEBUGP dummyprintf
#else
#define DEBUGP(args...)
#endif
#endif


@ -0,0 +1,49 @@
#include "djb2_hash.h"
#include <stdlib.h>
djb2_state_t *djb2_state_new(void)
{
djb2_state_t *state = (djb2_state_t *)malloc(sizeof(djb2_state_t));
if (!state) return NULL; // allocation failed
state->hashfunc = CMPH_HASH_DJB2;
return state;
}
void djb2_state_destroy(djb2_state_t *state)
{
free(state);
}
cmph_uint32 djb2_hash(djb2_state_t *state, const char *k, cmph_uint32 keylen)
{
register cmph_uint32 hash = 5381;
const unsigned char *ptr = (const unsigned char *)k;
cmph_uint32 i = 0;
while (i < keylen)
{
hash = hash*33 ^ *ptr;
++ptr, ++i;
}
return hash;
}
void djb2_state_dump(djb2_state_t *state, char **buf, cmph_uint32 *buflen)
{
*buf = NULL;
*buflen = 0;
return;
}
djb2_state_t *djb2_state_copy(djb2_state_t *src_state)
{
djb2_state_t *dest_state = (djb2_state_t *)malloc(sizeof(djb2_state_t));
if (!dest_state) return NULL; // allocation failed
dest_state->hashfunc = src_state->hashfunc;
return dest_state;
}
djb2_state_t *djb2_state_load(const char *buf, cmph_uint32 buflen)
{
djb2_state_t *state = (djb2_state_t *)malloc(sizeof(djb2_state_t));
if (!state) return NULL; // allocation failed
state->hashfunc = CMPH_HASH_DJB2;
return state;
}
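
The hash loop in djb2_hash above is the XOR variant of the djb2 hash: start at 5381, then multiply by 33 and XOR in each key byte. A minimal standalone sketch follows, using plain stdint types instead of the cmph typedefs; the name sketch_djb2_xor is hypothetical and not part of cmph.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR variant of djb2 over the first keylen bytes of key.
 * Arithmetic wraps modulo 2^32, as unsigned arithmetic does in C. */
uint32_t sketch_djb2_xor(const char *key, size_t keylen)
{
    const unsigned char *p = (const unsigned char *)key;
    uint32_t hash = 5381;
    size_t i;
    for (i = 0; i < keylen; i++)
        hash = (hash * 33U) ^ p[i];
    return hash;
}
```

An empty key hashes to the seed 5381, and the one-byte key "a" hashes to (5381 * 33) ^ 97 = 177604.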


@ -0,0 +1,18 @@
#ifndef __DJB2_HASH_H__
#define __DJB2_HASH_H__
#include "hash.h"
typedef struct __djb2_state_t
{
CMPH_HASH hashfunc;
} djb2_state_t;
djb2_state_t *djb2_state_new(void);
cmph_uint32 djb2_hash(djb2_state_t *state, const char *k, cmph_uint32 keylen);
void djb2_state_dump(djb2_state_t *state, char **buf, cmph_uint32 *buflen);
djb2_state_t *djb2_state_copy(djb2_state_t *src_state);
djb2_state_t *djb2_state_load(const char *buf, cmph_uint32 buflen);
void djb2_state_destroy(djb2_state_t *state);
#endif

539
girepository/cmph/fch.c Normal file

@ -0,0 +1,539 @@
#include "fch.h"
#include "cmph_structs.h"
#include "fch_structs.h"
#include "hash.h"
#include "bitbool.h"
#include "fch_buckets.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
#define INDEX 0 /* alignment index within a bucket */
//#define DEBUG
#include "debug.h"
static fch_buckets_t * mapping(cmph_config_t *mph);
static cmph_uint32 * ordering(fch_buckets_t * buckets);
static cmph_uint8 check_for_collisions_h2(fch_config_data_t *fch, fch_buckets_t * buckets, cmph_uint32 *sorted_indexes);
static void permut(cmph_uint32 * vector, cmph_uint32 n);
static cmph_uint8 searching(fch_config_data_t *fch, fch_buckets_t *buckets, cmph_uint32 *sorted_indexes);
fch_config_data_t *fch_config_new(void)
{
fch_config_data_t *fch;
fch = (fch_config_data_t *)malloc(sizeof(fch_config_data_t));
assert(fch);
memset(fch, 0, sizeof(fch_config_data_t));
fch->hashfuncs[0] = CMPH_HASH_JENKINS;
fch->hashfuncs[1] = CMPH_HASH_JENKINS;
fch->m = fch->b = 0;
fch->c = fch->p1 = fch->p2 = 0.0;
fch->g = NULL;
fch->h1 = NULL;
fch->h2 = NULL;
return fch;
}
void fch_config_destroy(cmph_config_t *mph)
{
fch_config_data_t *data = (fch_config_data_t *)mph->data;
//DEBUGP("Destroying algorithm dependent data\n");
free(data);
}
void fch_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
fch_config_data_t *fch = (fch_config_data_t *)mph->data;
CMPH_HASH *hashptr = hashfuncs;
cmph_uint32 i = 0;
while(*hashptr != CMPH_HASH_COUNT)
{
if (i >= 2) break; //fch only uses two hash functions
fch->hashfuncs[i] = *hashptr;
++i, ++hashptr;
}
}
cmph_uint32 mixh10h11h12(cmph_uint32 b, double p1, double p2, cmph_uint32 initial_index)
{
register cmph_uint32 int_p2 = (cmph_uint32)p2;
if (initial_index < p1) initial_index %= int_p2; /* h11 o h10 */
else { /* h12 o h10 */
initial_index %= b;
if(initial_index < p2) initial_index += int_p2;
}
return initial_index;
}
cmph_uint32 fch_calc_b(double c, cmph_uint32 m)
{
return (cmph_uint32)ceil((c*m)/(log((double)m)/log(2.0) + 1));
}
double fch_calc_p1(cmph_uint32 m)
{
return ceil(0.55*m);
}
double fch_calc_p2(cmph_uint32 b)
{
return ceil(0.3*b);
}
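
The bucket-mixing step in mixh10h11h12 above sends hash values below p1 into the first p2 buckets and every other value into the remaining buckets. A minimal standalone sketch of that mapping follows, using plain stdint types; the name sketch_mix_bucket is hypothetical and not part of cmph, and the sketch assumes p2 >= 1 and b >= 1 so that neither modulo divides by zero.

```c
#include <assert.h>
#include <stdint.h>

/* Map an initial hash index to a bucket number.
 * Indexes below p1 land in buckets [0, p2); all others land in
 * buckets [p2, b). Assumes p2 >= 1 and b >= 1. */
uint32_t sketch_mix_bucket(uint32_t b, double p1, double p2, uint32_t index)
{
    uint32_t int_p2 = (uint32_t)p2;
    if (index < p1) {
        index %= int_p2;     /* first region: fold into [0, p2) */
    } else {
        index %= b;          /* second region: fold into [0, b) ... */
        if (index < p2)
            index += int_p2; /* ... then shift results out of [0, p2) */
    }
    return index;
}
```

With b = 10, p1 = 55 and p2 = 3, index 20 maps to bucket 2, index 60 maps to bucket 3, and index 67 maps to bucket 7.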
static fch_buckets_t * mapping(cmph_config_t *mph)
{
cmph_uint32 i = 0;
fch_buckets_t *buckets = NULL;
fch_config_data_t *fch = (fch_config_data_t *)mph->data;
if (fch->h1) hash_state_destroy(fch->h1);
fch->h1 = hash_state_new(fch->hashfuncs[0], fch->m);
fch->b = fch_calc_b(fch->c, fch->m);
fch->p1 = fch_calc_p1(fch->m);
fch->p2 = fch_calc_p2(fch->b);
//DEBUGP("b:%u p1:%f p2:%f\n", fch->b, fch->p1, fch->p2);
buckets = fch_buckets_new(fch->b);
mph->key_source->rewind(mph->key_source->data);
for(i = 0; i < fch->m; i++)
{
cmph_uint32 h1, keylen;
char *key = NULL;
mph->key_source->read(mph->key_source->data, &key, &keylen);
h1 = hash(fch->h1, key, keylen) % fch->m;
h1 = mixh10h11h12 (fch->b, fch->p1, fch->p2, h1);
fch_buckets_insert(buckets, h1, key, keylen);
key = NULL; // transfer memory ownership
}
return buckets;
}
// Returns the bucket indexes sorted by bucket size.
static cmph_uint32 * ordering(fch_buckets_t * buckets)
{
return fch_buckets_get_indexes_sorted_by_size(buckets);
}
/* Check whether function h2 causes collisions among the keys of each bucket */
static cmph_uint8 check_for_collisions_h2(fch_config_data_t *fch, fch_buckets_t * buckets, cmph_uint32 *sorted_indexes)
{
//cmph_uint32 max_size = fch_buckets_get_max_size(buckets);
cmph_uint8 * hashtable = (cmph_uint8 *)calloc((size_t)fch->m, sizeof(cmph_uint8));
cmph_uint32 nbuckets = fch_buckets_get_nbuckets(buckets);
cmph_uint32 i = 0, index = 0, j =0;
for (i = 0; i < nbuckets; i++)
{
cmph_uint32 nkeys = fch_buckets_get_size(buckets, sorted_indexes[i]);
memset(hashtable, 0, (size_t)fch->m);
//DEBUGP("bucket %u -- nkeys: %u\n", i, nkeys);
for (j = 0; j < nkeys; j++)
{
char * key = fch_buckets_get_key(buckets, sorted_indexes[i], j);
cmph_uint32 keylen = fch_buckets_get_keylength(buckets, sorted_indexes[i], j);
index = hash(fch->h2, key, keylen) % fch->m;
if(hashtable[index]) { // collision detected
free(hashtable);
return 1;
}
hashtable[index] = 1;
}
}
free(hashtable);
return 0;
}
static void permut(cmph_uint32 * vector, cmph_uint32 n)
{
cmph_uint32 i, j, b;
for (i = 0; i < n; i++) {
j = (cmph_uint32) rand() % n;
b = vector[i];
vector[i] = vector[j];
vector[j] = b;
}
}
static cmph_uint8 searching(fch_config_data_t *fch, fch_buckets_t *buckets, cmph_uint32 *sorted_indexes)
{
cmph_uint32 * random_table = (cmph_uint32 *) calloc((size_t)fch->m, sizeof(cmph_uint32));
cmph_uint32 * map_table = (cmph_uint32 *) calloc((size_t)fch->m, sizeof(cmph_uint32));
cmph_uint32 iteration_to_generate_h2 = 0;
cmph_uint32 searching_iterations = 0;
cmph_uint8 restart = 0;
cmph_uint32 nbuckets = fch_buckets_get_nbuckets(buckets);
cmph_uint32 i, j, z, counter = 0, filled_count = 0;
if (fch->g) free (fch->g);
fch->g = (cmph_uint32 *) calloc((size_t)fch->b, sizeof(cmph_uint32));
//DEBUGP("max bucket size: %u\n", fch_buckets_get_max_size(buckets));
for(i = 0; i < fch->m; i++)
{
random_table[i] = i;
}
permut(random_table, fch->m);
for(i = 0; i < fch->m; i++)
{
map_table[random_table[i]] = i;
}
do {
if (fch->h2) hash_state_destroy(fch->h2);
fch->h2 = hash_state_new(fch->hashfuncs[1], fch->m);
restart = check_for_collisions_h2(fch, buckets, sorted_indexes);
filled_count = 0;
if (!restart)
{
searching_iterations++; iteration_to_generate_h2 = 0;
//DEBUGP("searching_iterations: %u\n", searching_iterations);
}
else {
iteration_to_generate_h2++;
//DEBUGP("iteration_to_generate_h2: %u\n", iteration_to_generate_h2);
}
for(i = 0; (i < nbuckets) && !restart; i++) {
cmph_uint32 bucketsize = fch_buckets_get_size(buckets, sorted_indexes[i]);
if (bucketsize == 0)
{
restart = 0; // false
break;
}
else restart = 1; // true
for(z = 0; (z < (fch->m - filled_count)) && restart; z++) {
char * key = fch_buckets_get_key(buckets, sorted_indexes[i], INDEX);
cmph_uint32 keylen = fch_buckets_get_keylength(buckets, sorted_indexes[i], INDEX);
cmph_uint32 h2 = hash(fch->h2, key, keylen) % fch->m;
counter = 0;
restart = 0; // false
fch->g[sorted_indexes[i]] = (fch->m + random_table[filled_count + z] - h2) % fch->m;
//DEBUGP("g[%u]: %u\n", sorted_indexes[i], fch->g[sorted_indexes[i]]);
j = INDEX;
do {
cmph_uint32 index = 0;
key = fch_buckets_get_key(buckets, sorted_indexes[i], j);
keylen = fch_buckets_get_keylength(buckets, sorted_indexes[i], j);
h2 = hash(fch->h2, key, keylen) % fch->m;
index = (h2 + fch->g[sorted_indexes[i]]) % fch->m;
//DEBUGP("key:%s keylen:%u index: %u h2:%u bucketsize:%u\n", key, keylen, index, h2, bucketsize);
if (map_table[index] >= filled_count) {
cmph_uint32 y = map_table[index];
cmph_uint32 ry = random_table[y];
random_table[y] = random_table[filled_count];
random_table[filled_count] = ry;
map_table[random_table[y]] = y;
map_table[random_table[filled_count]] = filled_count;
filled_count++;
counter ++;
}
else {
restart = 1; // true
filled_count = filled_count - counter;
counter = 0;
break;
}
j = (j + 1) % bucketsize;
} while(j % bucketsize != INDEX);
}
//getchar();
}
} while(restart && (searching_iterations < 10) && (iteration_to_generate_h2 < 1000));
free(map_table);
free(random_table);
return restart;
}
cmph_t *fch_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
fch_data_t *fchf = NULL;
cmph_uint32 iterations = 100;
cmph_uint8 restart_mapping = 0;
fch_buckets_t * buckets = NULL;
cmph_uint32 * sorted_indexes = NULL;
fch_config_data_t *fch = (fch_config_data_t *)mph->data;
fch->m = mph->key_source->nkeys;
//DEBUGP("m: %u\n", fch->m);
if (c <= 2) c = 2.6; // validating restrictions over parameter c.
fch->c = c;
//DEBUGP("c: %f\n", fch->c);
fch->h1 = NULL;
fch->h2 = NULL;
fch->g = NULL;
do
{
if (mph->verbosity)
{
fprintf(stderr, "Entering mapping step for mph creation of %u keys\n", fch->m);
}
if (buckets) fch_buckets_destroy(buckets);
buckets = mapping(mph);
if (mph->verbosity)
{
fprintf(stderr, "Starting ordering step\n");
}
if (sorted_indexes) free (sorted_indexes);
sorted_indexes = ordering(buckets);
if (mph->verbosity)
{
fprintf(stderr, "Starting searching step.\n");
}
restart_mapping = searching(fch, buckets, sorted_indexes);
iterations--;
} while(restart_mapping && iterations > 0);
if (buckets) fch_buckets_destroy(buckets);
if (sorted_indexes) free (sorted_indexes);
if (iterations == 0) return NULL;
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
fchf = (fch_data_t *)malloc(sizeof(fch_data_t));
fchf->g = fch->g;
fch->g = NULL; //transfer memory ownership
fchf->h1 = fch->h1;
fch->h1 = NULL; //transfer memory ownership
fchf->h2 = fch->h2;
fch->h2 = NULL; //transfer memory ownership
fchf->p2 = fch->p2;
fchf->p1 = fch->p1;
fchf->b = fch->b;
fchf->c = fch->c;
fchf->m = fch->m;
mphf->data = fchf;
mphf->size = fch->m;
//DEBUGP("Successfully generated minimal perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated minimal perfect hash function\n");
}
return mphf;
}
int fch_dump(cmph_t *mphf, FILE *fd)
{
char *buf = NULL;
cmph_uint32 buflen;
register size_t nbytes;
fch_data_t *data = (fch_data_t *)mphf->data;
#ifdef DEBUG
cmph_uint32 i;
#endif
__cmph_dump(mphf, fd);
hash_state_dump(data->h1, &buf, &buflen);
//DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
hash_state_dump(data->h2, &buf, &buflen);
//DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
nbytes = fwrite(&buflen, sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(buf, (size_t)buflen, (size_t)1, fd);
free(buf);
nbytes = fwrite(&(data->m), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->c), sizeof(double), (size_t)1, fd);
nbytes = fwrite(&(data->b), sizeof(cmph_uint32), (size_t)1, fd);
nbytes = fwrite(&(data->p1), sizeof(double), (size_t)1, fd);
nbytes = fwrite(&(data->p2), sizeof(double), (size_t)1, fd);
nbytes = fwrite(data->g, sizeof(cmph_uint32)*(data->b), (size_t)1, fd);
if (nbytes == 0 && ferror(fd)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return 0;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < data->b; ++i) fprintf(stderr, "%u ", data->g[i]);
fprintf(stderr, "\n");
#endif
return 1;
}
void fch_load(FILE *f, cmph_t *mphf)
{
char *buf = NULL;
cmph_uint32 buflen;
register size_t nbytes;
fch_data_t *fch = (fch_data_t *)malloc(sizeof(fch_data_t));
#ifdef DEBUG
cmph_uint32 i;
#endif
//DEBUGP("Loading fch mphf\n");
mphf->data = fch;
//DEBUGP("Reading h1\n");
fch->h1 = NULL;
nbytes = fread(&buflen, sizeof(cmph_uint32), (size_t)1, f);
//DEBUGP("Hash state of h1 has %u bytes\n", buflen);
buf = (char *)malloc((size_t)buflen);
nbytes = fread(buf, (size_t)buflen, (size_t)1, f);
fch->h1 = hash_state_load(buf, buflen);
free(buf);
//DEBUGP("Loading fch mphf\n");
mphf->data = fch;
//DEBUGP("Reading h2\n");
fch->h2 = NULL;
nbytes = fread(&buflen, sizeof(cmph_uint32), (size_t)1, f);
//DEBUGP("Hash state of h2 has %u bytes\n", buflen);
buf = (char *)malloc((size_t)buflen);
nbytes = fread(buf, (size_t)buflen, (size_t)1, f);
fch->h2 = hash_state_load(buf, buflen);
free(buf);
//DEBUGP("Reading m and n\n");
nbytes = fread(&(fch->m), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(fch->c), sizeof(double), (size_t)1, f);
nbytes = fread(&(fch->b), sizeof(cmph_uint32), (size_t)1, f);
nbytes = fread(&(fch->p1), sizeof(double), (size_t)1, f);
nbytes = fread(&(fch->p2), sizeof(double), (size_t)1, f);
fch->g = (cmph_uint32 *)malloc(sizeof(cmph_uint32)*fch->b);
nbytes = fread(fch->g, fch->b*sizeof(cmph_uint32), (size_t)1, f);
if (nbytes == 0 && ferror(f)) {
fprintf(stderr, "ERROR: %s\n", strerror(errno));
return;
}
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < fch->b; ++i) fprintf(stderr, "%u ", fch->g[i]);
fprintf(stderr, "\n");
#endif
return;
}
cmph_uint32 fch_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
fch_data_t *fch = mphf->data;
cmph_uint32 h1 = hash(fch->h1, key, keylen) % fch->m;
cmph_uint32 h2 = hash(fch->h2, key, keylen) % fch->m;
h1 = mixh10h11h12 (fch->b, fch->p1, fch->p2, h1);
//DEBUGP("key: %s h1: %u h2: %u g[h1]: %u\n", key, h1, h2, fch->g[h1]);
return (h2 + fch->g[h1]) % fch->m;
}
void fch_destroy(cmph_t *mphf)
{
fch_data_t *data = (fch_data_t *)mphf->data;
free(data->g);
hash_state_destroy(data->h1);
hash_state_destroy(data->h2);
free(data);
free(mphf);
}
/** \fn void fch_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void fch_pack(cmph_t *mphf, void *packed_mphf)
{
fch_data_t *data = (fch_data_t *)mphf->data;
cmph_uint8 * ptr = packed_mphf;
// packing h1 type
CMPH_HASH h1_type = hash_get_type(data->h1);
CMPH_HASH h2_type;
*((cmph_uint32 *) ptr) = h1_type;
ptr += sizeof(cmph_uint32);
// packing h1
hash_state_pack(data->h1, ptr);
ptr += hash_state_packed_size(h1_type);
// packing h2 type
h2_type = hash_get_type(data->h2);
*((cmph_uint32 *) ptr) = h2_type;
ptr += sizeof(cmph_uint32);
// packing h2
hash_state_pack(data->h2, ptr);
ptr += hash_state_packed_size(h2_type);
// packing m
*((cmph_uint32 *) ptr) = data->m;
ptr += sizeof(data->m);
// packing b
*((cmph_uint32 *) ptr) = data->b;
ptr += sizeof(data->b);
// packing p1
*((cmph_uint64 *)ptr) = (cmph_uint64)data->p1;
ptr += sizeof(data->p1);
// packing p2
*((cmph_uint64 *)ptr) = (cmph_uint64)data->p2;
ptr += sizeof(data->p2);
// packing g
memcpy(ptr, data->g, sizeof(cmph_uint32)*(data->b));
}
/** \fn cmph_uint32 fch_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 fch_packed_size(cmph_t *mphf)
{
fch_data_t *data = (fch_data_t *)mphf->data;
CMPH_HASH h1_type = hash_get_type(data->h1);
CMPH_HASH h2_type = hash_get_type(data->h2);
return (cmph_uint32)(sizeof(CMPH_ALGO) + hash_state_packed_size(h1_type) + hash_state_packed_size(h2_type) +
4*sizeof(cmph_uint32) + 2*sizeof(double) + sizeof(cmph_uint32)*(data->b));
}
/** \fn cmph_uint32 fch_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 fch_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen)
{
register cmph_uint8 *h1_ptr = packed_mphf;
register CMPH_HASH h1_type = *((cmph_uint32 *)h1_ptr);
register cmph_uint8 *h2_ptr;
register CMPH_HASH h2_type;
register cmph_uint32 *g_ptr;
register cmph_uint32 m, b, h1, h2;
register double p1, p2;
h1_ptr += 4;
h2_ptr = h1_ptr + hash_state_packed_size(h1_type);
h2_type = *((cmph_uint32 *)h2_ptr);
h2_ptr += 4;
g_ptr = (cmph_uint32 *)(h2_ptr + hash_state_packed_size(h2_type));
m = *g_ptr++;
b = *g_ptr++;
p1 = (double)(*((cmph_uint64 *)g_ptr));
g_ptr += 2;
p2 = (double)(*((cmph_uint64 *)g_ptr));
g_ptr += 2;
h1 = hash_packed(h1_ptr, h1_type, key, keylen) % m;
h2 = hash_packed(h2_ptr, h2_type, key, keylen) % m;
h1 = mixh10h11h12 (b, p1, p2, h1);
return (h2 + g_ptr[h1]) % m;
}
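The numeric tail of the packed layout that fch_pack writes and fch_search_packed reads back (m, b, p1 and p2 as 64-bit integers, then the g array) can be illustrated with a small standalone sketch. Everything named demo_* below is hypothetical and not part of cmph; the sketch also copies fields with memcpy rather than the pointer casts used in the listing, which is a choice made here to avoid alignment assumptions, and it leaves out the two packed hash states.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the numeric fields of fch_data_t. */
typedef struct {
    uint32_t m, b;
    double p1, p2;      /* whole numbers in practice (results of ceil) */
    const uint32_t *g;  /* b entries */
} demo_fch_t;

/* Bytes needed for the numeric tail: m, b, p1, p2, then g[0..b-1]. */
static size_t demo_packed_size(const demo_fch_t *d)
{
    return 2 * sizeof(uint32_t) + 2 * sizeof(uint64_t)
         + (size_t)d->b * sizeof(uint32_t);
}

/* Writes the fields back to back into out, which must hold demo_packed_size() bytes. */
static void demo_pack(const demo_fch_t *d, uint8_t *out)
{
    uint64_t p1 = (uint64_t)d->p1, p2 = (uint64_t)d->p2;
    memcpy(out, &d->m, sizeof d->m); out += sizeof d->m;
    memcpy(out, &d->b, sizeof d->b); out += sizeof d->b;
    memcpy(out, &p1, sizeof p1);     out += sizeof p1;
    memcpy(out, &p2, sizeof p2);     out += sizeof p2;
    memcpy(out, d->g, (size_t)d->b * sizeof(uint32_t));
}

/* Reads the header fields into hdr and returns g[index] from the packed buffer. */
static uint32_t demo_unpack(const uint8_t *in, demo_fch_t *hdr, uint32_t index)
{
    uint64_t p1, p2;
    uint32_t value;
    memcpy(&hdr->m, in, sizeof hdr->m); in += sizeof hdr->m;
    memcpy(&hdr->b, in, sizeof hdr->b); in += sizeof hdr->b;
    memcpy(&p1, in, sizeof p1);         in += sizeof p1;
    memcpy(&p2, in, sizeof p2);         in += sizeof p2;
    hdr->p1 = (double)p1;
    hdr->p2 = (double)p2;
    hdr->g = NULL;  /* g stays inside the packed buffer */
    memcpy(&value, in + (size_t)index * sizeof value, sizeof value);
    return value;
}
```

Storing p1 and p2 as integers loses nothing only as long as they hold whole numbers, which the struct comments suggest since both are described as results of ceil.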

girepository/cmph/fch.h Normal file

@ -0,0 +1,48 @@
#ifndef __CMPH_FCH_H__
#define __CMPH_FCH_H__
#include "cmph.h"
typedef struct __fch_data_t fch_data_t;
typedef struct __fch_config_data_t fch_config_data_t;
/* Parameters calculation */
cmph_uint32 fch_calc_b(double c, cmph_uint32 m);
double fch_calc_p1(cmph_uint32 m);
double fch_calc_p2(cmph_uint32 b);
cmph_uint32 mixh10h11h12(cmph_uint32 b, double p1, double p2, cmph_uint32 initial_index);
fch_config_data_t *fch_config_new(void);
void fch_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void fch_config_destroy(cmph_config_t *mph);
cmph_t *fch_new(cmph_config_t *mph, double c);
void fch_load(FILE *f, cmph_t *mphf);
int fch_dump(cmph_t *mphf, FILE *f);
void fch_destroy(cmph_t *mphf);
cmph_uint32 fch_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
/** \fn void fch_pack(cmph_t *mphf, void *packed_mphf);
* \brief Support the ability to pack a perfect hash function into a preallocated contiguous memory space pointed by packed_mphf.
* \param mphf pointer to the resulting mphf
* \param packed_mphf pointer to the contiguous memory area used to store the resulting mphf. The size of packed_mphf must be at least cmph_packed_size()
*/
void fch_pack(cmph_t *mphf, void *packed_mphf);
/** \fn cmph_uint32 fch_packed_size(cmph_t *mphf);
* \brief Return the amount of space needed to pack mphf.
* \param mphf pointer to a mphf
* \return the size of the packed function or zero for failures
*/
cmph_uint32 fch_packed_size(cmph_t *mphf);
/** \fn cmph_uint32 fch_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
* \brief Use the packed mphf to do a search.
* \param packed_mphf pointer to the packed mphf
* \param key key to be hashed
* \param keylen key length in bytes
* \return The mphf value
*/
cmph_uint32 fch_search_packed(void *packed_mphf, const char *key, cmph_uint32 keylen);
#endif


@ -0,0 +1,214 @@
#include "vqueue.h"
#include "fch_buckets.h"
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
//#define DEBUG
#include "debug.h"
typedef struct __fch_bucket_entry_t
{
char * value;
cmph_uint32 length;
} fch_bucket_entry_t;
typedef struct __fch_bucket_t
{
fch_bucket_entry_t * entries;
cmph_uint32 capacity, size;
} fch_bucket_t;
static void fch_bucket_new(fch_bucket_t *bucket)
{
assert(bucket);
bucket->size = 0;
bucket->entries = NULL;
bucket->capacity = 0;
}
static void fch_bucket_destroy(fch_bucket_t *bucket)
{
cmph_uint32 i;
assert(bucket);
for (i = 0; i < bucket->size; i++)
{
free((bucket->entries + i)->value);
}
free(bucket->entries);
}
static void fch_bucket_reserve(fch_bucket_t *bucket, cmph_uint32 size)
{
assert(bucket);
if (bucket->capacity < size)
{
cmph_uint32 new_capacity = bucket->capacity + 1;
DEBUGP("Increasing current capacity %u to %u\n", bucket->capacity, size);
while (new_capacity < size)
{
new_capacity *= 2;
}
bucket->entries = (fch_bucket_entry_t *)realloc(bucket->entries, sizeof(fch_bucket_entry_t)*new_capacity);
assert(bucket->entries);
bucket->capacity = new_capacity;
DEBUGP("Increased\n");
}
}
static void fch_bucket_insert(fch_bucket_t *bucket, char *val, cmph_uint32 val_length)
{
assert(bucket);
fch_bucket_reserve(bucket, bucket->size + 1);
(bucket->entries + bucket->size)->value = val;
(bucket->entries + bucket->size)->length = val_length;
++(bucket->size);
}
static cmph_uint8 fch_bucket_is_empty(fch_bucket_t *bucket)
{
assert(bucket);
return (cmph_uint8)(bucket->size == 0);
}
static cmph_uint32 fch_bucket_size(fch_bucket_t *bucket)
{
assert(bucket);
return bucket->size;
}
static char * fch_bucket_get_key(fch_bucket_t *bucket, cmph_uint32 index_key)
{
assert(bucket); assert(index_key < bucket->size);
return (bucket->entries + index_key)->value;
}
static cmph_uint32 fch_bucket_get_length(fch_bucket_t *bucket, cmph_uint32 index_key)
{
assert(bucket); assert(index_key < bucket->size);
return (bucket->entries + index_key)->length;
}
static void fch_bucket_print(fch_bucket_t * bucket, cmph_uint32 index)
{
cmph_uint32 i;
assert(bucket);
fprintf(stderr, "Printing bucket %u ...\n", index);
for (i = 0; i < bucket->size; i++)
{
fprintf(stderr, " key: %s\n", (bucket->entries + i)->value);
}
}
//////////////////////////////////////////////////////////////////////////////////////
struct __fch_buckets_t
{
fch_bucket_t * values;
cmph_uint32 nbuckets, max_size;
};
fch_buckets_t * fch_buckets_new(cmph_uint32 nbuckets)
{
cmph_uint32 i;
fch_buckets_t *buckets = (fch_buckets_t *)malloc(sizeof(fch_buckets_t));
assert(buckets);
buckets->values = (fch_bucket_t *)calloc((size_t)nbuckets, sizeof(fch_bucket_t));
// check the allocation before the loop below dereferences it
assert(buckets->values);
for (i = 0; i < nbuckets; i++) fch_bucket_new(buckets->values + i);
buckets->nbuckets = nbuckets;
buckets->max_size = 0;
return buckets;
}
cmph_uint8 fch_buckets_is_empty(fch_buckets_t * buckets, cmph_uint32 index)
{
assert(index < buckets->nbuckets);
return fch_bucket_is_empty(buckets->values + index);
}
void fch_buckets_insert(fch_buckets_t * buckets, cmph_uint32 index, char * key, cmph_uint32 length)
{
assert(index < buckets->nbuckets);
fch_bucket_insert(buckets->values + index, key, length);
if (fch_bucket_size(buckets->values + index) > buckets->max_size)
{
buckets->max_size = fch_bucket_size(buckets->values + index);
}
}
cmph_uint32 fch_buckets_get_size(fch_buckets_t * buckets, cmph_uint32 index)
{
assert(index < buckets->nbuckets);
return fch_bucket_size(buckets->values + index);
}
char * fch_buckets_get_key(fch_buckets_t * buckets, cmph_uint32 index, cmph_uint32 index_key)
{
assert(index < buckets->nbuckets);
return fch_bucket_get_key(buckets->values + index, index_key);
}
cmph_uint32 fch_buckets_get_keylength(fch_buckets_t * buckets, cmph_uint32 index, cmph_uint32 index_key)
{
assert(index < buckets->nbuckets);
return fch_bucket_get_length(buckets->values + index, index_key);
}
cmph_uint32 fch_buckets_get_max_size(fch_buckets_t * buckets)
{
return buckets->max_size;
}
cmph_uint32 fch_buckets_get_nbuckets(fch_buckets_t * buckets)
{
return buckets->nbuckets;
}
cmph_uint32 * fch_buckets_get_indexes_sorted_by_size(fch_buckets_t * buckets)
{
cmph_uint32 i = 0;
cmph_uint32 sum = 0, value;
cmph_uint32 *nbuckets_size = (cmph_uint32 *) calloc((size_t)buckets->max_size + 1, sizeof(cmph_uint32));
cmph_uint32 * sorted_indexes = (cmph_uint32 *) calloc((size_t)buckets->nbuckets, sizeof(cmph_uint32));
// collect how many buckets for each size.
for(i = 0; i < buckets->nbuckets; i++) nbuckets_size[fch_bucket_size(buckets->values + i)] ++;
// calculating offset considering a decreasing order of buckets size.
value = nbuckets_size[buckets->max_size];
nbuckets_size[buckets->max_size] = sum;
// i is unsigned, so a test of "i >= 0" is always true; count down with i-- > 0 instead.
for(i = buckets->max_size; i-- > 0; )
{
sum += value;
value = nbuckets_size[i];
nbuckets_size[i] = sum;
}
for(i = 0; i < buckets->nbuckets; i++)
{
sorted_indexes[nbuckets_size[fch_bucket_size(buckets->values + i)]] = (cmph_uint32)i;
nbuckets_size[fch_bucket_size(buckets->values + i)] ++;
}
free(nbuckets_size);
return sorted_indexes;
}
void fch_buckets_print(fch_buckets_t * buckets)
{
cmph_uint32 i;
for (i = 0; i < buckets->nbuckets; i++) fch_bucket_print(buckets->values + i, i);
}
void fch_buckets_destroy(fch_buckets_t * buckets)
{
cmph_uint32 i;
for (i = 0; i < buckets->nbuckets; i++) fch_bucket_destroy(buckets->values + i);
free(buckets->values);
free(buckets);
}
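The index ordering built by fch_buckets_get_indexes_sorted_by_size is a counting sort keyed on bucket size, largest first. A minimal standalone sketch of that idea follows; demo_sort_by_size_desc and its parameters are hypothetical names, and the sizes are passed in as a plain array instead of being read from fch_bucket_t values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns a newly allocated array holding the indexes 0..n-1 ordered by
 * decreasing sizes[i]; ties keep their original relative order.
 * max_size must be >= every entry of sizes. The caller frees the result. */
static uint32_t *demo_sort_by_size_desc(const uint32_t *sizes, uint32_t n,
                                        uint32_t max_size)
{
    uint32_t i, sum = 0, value;
    uint32_t *offset = calloc((size_t)max_size + 1, sizeof *offset);
    uint32_t *sorted = calloc(n ? n : 1, sizeof *sorted);
    if (!offset || !sorted) { free(offset); free(sorted); return NULL; }

    /* Count how many buckets have each size. */
    for (i = 0; i < n; i++) offset[sizes[i]]++;

    /* Turn the counts into start offsets, visiting the largest size first.
     * The i-- > 0 form keeps the unsigned counter from wrapping. */
    for (i = max_size + 1; i-- > 0; ) {
        value = offset[i];
        offset[i] = sum;
        sum += value;
    }

    /* Drop each index into the next free slot for its size. */
    for (i = 0; i < n; i++) sorted[offset[sizes[i]]++] = i;

    free(offset);
    return sorted;
}
```

For sizes {1, 3, 0, 3, 2} with max_size 3 this yields the index order 1, 3, 4, 0, 2: both buckets of size 3 first, then sizes 2, 1 and 0.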


@ -0,0 +1,30 @@
#ifndef __CMPH_FCH_BUCKETS_H__
#define __CMPH_FCH_BUCKETS_H__
#include "cmph_types.h"
typedef struct __fch_buckets_t fch_buckets_t;
fch_buckets_t * fch_buckets_new(cmph_uint32 nbuckets);
cmph_uint8 fch_buckets_is_empty(fch_buckets_t * buckets, cmph_uint32 index);
void fch_buckets_insert(fch_buckets_t * buckets, cmph_uint32 index, char * key, cmph_uint32 length);
cmph_uint32 fch_buckets_get_size(fch_buckets_t * buckets, cmph_uint32 index);
char * fch_buckets_get_key(fch_buckets_t * buckets, cmph_uint32 index, cmph_uint32 index_key);
cmph_uint32 fch_buckets_get_keylength(fch_buckets_t * buckets, cmph_uint32 index, cmph_uint32 index_key);
// returns the size of biggest bucket.
cmph_uint32 fch_buckets_get_max_size(fch_buckets_t * buckets);
// returns the number of buckets.
cmph_uint32 fch_buckets_get_nbuckets(fch_buckets_t * buckets);
cmph_uint32 * fch_buckets_get_indexes_sorted_by_size(fch_buckets_t * buckets);
void fch_buckets_print(fch_buckets_t * buckets);
void fch_buckets_destroy(fch_buckets_t * buckets);
#endif


@ -0,0 +1,30 @@
#ifndef __CMPH_FCH_STRUCTS_H__
#define __CMPH_FCH_STRUCTS_H__
#include "hash_state.h"
struct __fch_data_t
{
cmph_uint32 m; // words count
double c; // constant c
cmph_uint32 b; // parameter b = ceil(c*m/(log(m)/log(2) + 1)). Doesn't need to be stored
double p1; // constant p1 = ceil(0.6*m). Doesn't need to be stored
double p2; // constant p2 = ceil(0.3*b). Doesn't need to be stored
cmph_uint32 *g; // g function.
hash_state_t *h1; // h10 function.
hash_state_t *h2; // h20 function.
};
struct __fch_config_data_t
{
CMPH_HASH hashfuncs[2];
cmph_uint32 m; // words count
double c; // constant c
cmph_uint32 b; // parameter b = ceil(c*m/(log(m)/log(2) + 1)). Doesn't need to be stored
double p1; // constant p1 = ceil(0.6*m). Doesn't need to be stored
double p2; // constant p2 = ceil(0.3*b). Doesn't need to be stored
cmph_uint32 *g; // g function.
hash_state_t *h1; // h10 function.
hash_state_t *h2; // h20 function.
};
#endif


@ -0,0 +1,53 @@
#include "fnv_hash.h"
#include <stdlib.h>
fnv_state_t *fnv_state_new()
{
fnv_state_t *state = (fnv_state_t *)malloc(sizeof(fnv_state_t));
state->hashfunc = CMPH_HASH_FNV;
return state;
}
void fnv_state_destroy(fnv_state_t *state)
{
free(state);
}
cmph_uint32 fnv_hash(fnv_state_t *state, const char *k, cmph_uint32 keylen)
{
const unsigned char *bp = (const unsigned char *)k;
const unsigned char *be = bp + keylen;
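unsigned int hval = 0; // not static: a static accumulator would carry over between calls and make the result depend on earlier keys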
while (bp < be)
{
//hval *= 0x01000193; good for non-gcc compiler
hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24); //good for gcc
hval ^= *bp++;
}
return hval;
}
void fnv_state_dump(fnv_state_t *state, char **buf, cmph_uint32 *buflen)
{
*buf = NULL;
*buflen = 0;
return;
}
fnv_state_t * fnv_state_copy(fnv_state_t *src_state)
{
fnv_state_t *dest_state = (fnv_state_t *)malloc(sizeof(fnv_state_t));
dest_state->hashfunc = src_state->hashfunc;
return dest_state;
}
fnv_state_t *fnv_state_load(const char *buf, cmph_uint32 buflen)
{
fnv_state_t *state = (fnv_state_t *)malloc(sizeof(fnv_state_t));
state->hashfunc = CMPH_HASH_FNV;
return state;
}
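The shift-and-add line in fnv_hash is an expanded multiplication by the 32-bit FNV prime 0x01000193 (16777619 = 1 + 2 + 16 + 128 + 256 + 2^24). The sketch below, with hypothetical demo_* names and fixed-width uint32_t arithmetic, writes both forms side by side so they can be compared. It starts each call from a zero basis held in a local variable, which is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply form: hval * 0x01000193 (the 32-bit FNV prime), then XOR the byte. */
static uint32_t demo_fnv_mul(const char *k, size_t len)
{
    const unsigned char *bp = (const unsigned char *)k;
    uint32_t hval = 0;  /* zero basis, local to this call */
    while (len--) {
        hval *= 0x01000193u;
        hval ^= *bp++;
    }
    return hval;
}

/* Shift-add form: hval + hval*(2 + 16 + 128 + 256 + 2^24) == hval * 16777619. */
static uint32_t demo_fnv_shift(const char *k, size_t len)
{
    const unsigned char *bp = (const unsigned char *)k;
    uint32_t hval = 0;
    while (len--) {
        hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
        hval ^= *bp++;
    }
    return hval;
}
```

Both functions wrap modulo 2^32, so they return identical values for any key, and the same key always hashes to the same value across calls.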


@ -0,0 +1,18 @@
#ifndef __FNV_HASH_H__
#define __FNV_HASH_H__
#include "hash.h"
typedef struct __fnv_state_t
{
CMPH_HASH hashfunc;
} fnv_state_t;
fnv_state_t *fnv_state_new();
cmph_uint32 fnv_hash(fnv_state_t *state, const char *k, cmph_uint32 keylen);
void fnv_state_dump(fnv_state_t *state, char **buf, cmph_uint32 *buflen);
fnv_state_t *fnv_state_copy(fnv_state_t *src_state);
fnv_state_t *fnv_state_load(const char *buf, cmph_uint32 buflen);
void fnv_state_destroy(fnv_state_t *state);
#endif

girepository/cmph/graph.c Normal file

@ -0,0 +1,338 @@
#include "graph.h"
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <assert.h>
#include <string.h>
#include "vstack.h"
#include "bitbool.h"
//#define DEBUG
#include "debug.h"
/* static const cmph_uint8 bitmask[8] = { 1, 1 << 1, 1 << 2, 1 << 3, 1 << 4, 1 << 5, 1 << 6, 1 << 7 }; */
/* #define GETBIT(array, i) (array[(i) / 8] & bitmask[(i) % 8]) */
/* #define SETBIT(array, i) (array[(i) / 8] |= bitmask[(i) % 8]) */
/* #define UNSETBIT(array, i) (array[(i) / 8] &= (~(bitmask[(i) % 8]))) */
#define abs_edge(e, i) (e % g->nedges + i * g->nedges)
struct __graph_t
{
cmph_uint32 nnodes;
cmph_uint32 nedges;
cmph_uint32 *edges;
cmph_uint32 *first;
cmph_uint32 *next;
cmph_uint8 *critical_nodes; /* included -- Fabiano*/
cmph_uint32 ncritical_nodes; /* included -- Fabiano*/
cmph_uint32 cedges;
int shrinking;
};
static cmph_uint32 EMPTY = UINT_MAX;
graph_t *graph_new(cmph_uint32 nnodes, cmph_uint32 nedges)
{
graph_t *graph = (graph_t *)malloc(sizeof(graph_t));
if (!graph) return NULL;
graph->edges = (cmph_uint32 *)malloc(sizeof(cmph_uint32) * 2 * nedges);
graph->next = (cmph_uint32 *)malloc(sizeof(cmph_uint32) * 2 * nedges);
graph->first = (cmph_uint32 *)malloc(sizeof(cmph_uint32) * nnodes);
graph->critical_nodes = NULL; /* included -- Fabiano*/
graph->ncritical_nodes = 0; /* included -- Fabiano*/
graph->nnodes = nnodes;
graph->nedges = nedges;
graph_clear_edges(graph);
return graph;
}
void graph_destroy(graph_t *graph)
{
DEBUGP("Destroying graph\n");
free(graph->edges);
free(graph->first);
free(graph->next);
free(graph->critical_nodes); /* included -- Fabiano*/
free(graph);
return;
}
void graph_print(graph_t *g)
{
cmph_uint32 i, e;
for (i = 0; i < g->nnodes; ++i)
{
DEBUGP("Printing edges connected to %u\n", i);
e = g->first[i];
if (e != EMPTY)
{
printf("%u -> %u\n", g->edges[abs_edge(e, 0)], g->edges[abs_edge(e, 1)]);
while ((e = g->next[e]) != EMPTY)
{
printf("%u -> %u\n", g->edges[abs_edge(e, 0)], g->edges[abs_edge(e, 1)]);
}
}
}
return;
}
void graph_add_edge(graph_t *g, cmph_uint32 v1, cmph_uint32 v2)
{
cmph_uint32 e = g->cedges;
assert(v1 < g->nnodes);
assert(v2 < g->nnodes);
assert(e < g->nedges);
assert(!g->shrinking);
g->next[e] = g->first[v1];
g->first[v1] = e;
g->edges[e] = v2;
g->next[e + g->nedges] = g->first[v2];
g->first[v2] = e + g->nedges;
g->edges[e + g->nedges] = v1;
++(g->cedges);
}
static int check_edge(graph_t *g, cmph_uint32 e, cmph_uint32 v1, cmph_uint32 v2)
{
DEBUGP("Checking edge %u %u looking for %u %u\n", g->edges[abs_edge(e, 0)], g->edges[abs_edge(e, 1)], v1, v2);
if (g->edges[abs_edge(e, 0)] == v1 && g->edges[abs_edge(e, 1)] == v2) return 1;
if (g->edges[abs_edge(e, 0)] == v2 && g->edges[abs_edge(e, 1)] == v1) return 1;
return 0;
}
cmph_uint32 graph_edge_id(graph_t *g, cmph_uint32 v1, cmph_uint32 v2)
{
cmph_uint32 e;
e = g->first[v1];
assert(e != EMPTY);
if (check_edge(g, e, v1, v2)) return abs_edge(e, 0);
do
{
e = g->next[e];
assert(e != EMPTY);
}
while (!check_edge(g, e, v1, v2));
return abs_edge(e, 0);
}
static void del_edge_point(graph_t *g, cmph_uint32 v1, cmph_uint32 v2)
{
cmph_uint32 e, prev;
DEBUGP("Deleting edge point %u %u\n", v1, v2);
e = g->first[v1];
if (check_edge(g, e, v1, v2))
{
g->first[v1] = g->next[e];
//g->edges[e] = EMPTY;
DEBUGP("Deleted\n");
return;
}
DEBUGP("Checking linked list\n");
do
{
prev = e;
e = g->next[e];
assert(e != EMPTY);
}
while (!check_edge(g, e, v1, v2));
g->next[prev] = g->next[e];
//g->edges[e] = EMPTY;
DEBUGP("Deleted\n");
}
void graph_del_edge(graph_t *g, cmph_uint32 v1, cmph_uint32 v2)
{
g->shrinking = 1;
del_edge_point(g, v1, v2);
del_edge_point(g, v2, v1);
}
void graph_clear_edges(graph_t *g)
{
cmph_uint32 i;
for (i = 0; i < g->nnodes; ++i) g->first[i] = EMPTY;
for (i = 0; i < g->nedges*2; ++i)
{
g->edges[i] = EMPTY;
g->next[i] = EMPTY;
}
g->cedges = 0;
g->shrinking = 0;
}
static cmph_uint8 find_degree1_edge(graph_t *g, cmph_uint32 v, cmph_uint8 *deleted, cmph_uint32 *e)
{
cmph_uint32 edge = g->first[v];
cmph_uint8 found = 0;
DEBUGP("Checking degree of vertex %u\n", v);
if (edge == EMPTY) return 0;
else if (!(GETBIT(deleted, abs_edge(edge, 0))))
{
found = 1;
*e = edge;
}
while(1)
{
edge = g->next[edge];
if (edge == EMPTY) break;
if (GETBIT(deleted, abs_edge(edge, 0))) continue;
if (found) return 0;
DEBUGP("Found first edge\n");
*e = edge;
found = 1;
}
return found;
}
static void cyclic_del_edge(graph_t *g, cmph_uint32 v, cmph_uint8 *deleted)
{
cmph_uint32 e = 0;
cmph_uint8 degree1;
cmph_uint32 v1 = v;
cmph_uint32 v2 = 0;
degree1 = find_degree1_edge(g, v1, deleted, &e);
if (!degree1) return;
while(1)
{
DEBUGP("Deleting edge %u (%u->%u)\n", e, g->edges[abs_edge(e, 0)], g->edges[abs_edge(e, 1)]);
SETBIT(deleted, abs_edge(e, 0));
v2 = g->edges[abs_edge(e, 0)];
if (v2 == v1) v2 = g->edges[abs_edge(e, 1)];
DEBUGP("Checking if second endpoint %u has degree 1\n", v2);
degree1 = find_degree1_edge(g, v2, deleted, &e);
if (degree1)
{
DEBUGP("Inspecting vertex %u\n", v2);
v1 = v2;
}
else break;
}
}
int graph_is_cyclic(graph_t *g)
{
cmph_uint32 i;
cmph_uint32 v;
cmph_uint8 *deleted = (cmph_uint8 *)malloc((g->nedges*sizeof(cmph_uint8))/8 + 1);
size_t deleted_len = g->nedges/8 + 1;
memset(deleted, 0, deleted_len);
DEBUGP("Looking for cycles in graph with %u vertices and %u edges\n", g->nnodes, g->nedges);
for (v = 0; v < g->nnodes; ++v)
{
cyclic_del_edge(g, v, deleted);
}
for (i = 0; i < g->nedges; ++i)
{
if (!(GETBIT(deleted, i)))
{
DEBUGP("Edge %u %u->%u was not deleted\n", i, g->edges[i], g->edges[i + g->nedges]);
free(deleted);
return 1;
}
}
free(deleted);
return 0;
}
cmph_uint8 graph_node_is_critical(graph_t * g, cmph_uint32 v) /* included -- Fabiano */
{
return (cmph_uint8)GETBIT(g->critical_nodes,v);
}
void graph_obtain_critical_nodes(graph_t *g) /* included -- Fabiano*/
{
cmph_uint32 i;
cmph_uint32 v;
cmph_uint8 *deleted = (cmph_uint8 *)malloc((g->nedges*sizeof(cmph_uint8))/8+1);
size_t deleted_len = g->nedges/8 + 1;
memset(deleted, 0, deleted_len);
free(g->critical_nodes);
g->critical_nodes = (cmph_uint8 *)malloc((g->nnodes*sizeof(cmph_uint8))/8 + 1);
g->ncritical_nodes = 0;
memset(g->critical_nodes, 0, (g->nnodes*sizeof(cmph_uint8))/8 + 1);
DEBUGP("Looking for the 2-core in graph with %u vertices and %u edges\n", g->nnodes, g->nedges);
for (v = 0; v < g->nnodes; ++v)
{
cyclic_del_edge(g, v, deleted);
}
for (i = 0; i < g->nedges; ++i)
{
if (!(GETBIT(deleted,i)))
{
DEBUGP("Edge %u %u->%u belongs to the 2-core\n", i, g->edges[i], g->edges[i + g->nedges]);
if(!(GETBIT(g->critical_nodes,g->edges[i])))
{
g->ncritical_nodes ++;
SETBIT(g->critical_nodes,g->edges[i]);
}
if(!(GETBIT(g->critical_nodes,g->edges[i + g->nedges])))
{
g->ncritical_nodes ++;
SETBIT(g->critical_nodes,g->edges[i + g->nedges]);
}
}
}
free(deleted);
}
cmph_uint8 graph_contains_edge(graph_t *g, cmph_uint32 v1, cmph_uint32 v2) /* included -- Fabiano*/
{
cmph_uint32 e;
e = g->first[v1];
if(e == EMPTY) return 0;
if (check_edge(g, e, v1, v2)) return 1;
do
{
e = g->next[e];
if(e == EMPTY) return 0;
}
while (!check_edge(g, e, v1, v2));
return 1;
}
cmph_uint32 graph_vertex_id(graph_t *g, cmph_uint32 e, cmph_uint32 id) /* included -- Fabiano*/
{
return (g->edges[e + id*g->nedges]);
}
cmph_uint32 graph_ncritical_nodes(graph_t *g) /* included -- Fabiano*/
{
return g->ncritical_nodes;
}
graph_iterator_t graph_neighbors_it(graph_t *g, cmph_uint32 v)
{
graph_iterator_t it;
it.vertex = v;
it.edge = g->first[v];
return it;
}
cmph_uint32 graph_next_neighbor(graph_t *g, graph_iterator_t* it)
{
cmph_uint32 ret;
if(it->edge == EMPTY) return GRAPH_NO_NEIGHBOR;
if (g->edges[it->edge] == it->vertex) ret = g->edges[it->edge + g->nedges];
else ret = g->edges[it->edge];
it->edge = g->next[it->edge];
return ret;
}
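graph_is_cyclic works by repeatedly deleting edges that hang off a degree-1 vertex; any edge that survives belongs to a cycle. The standalone sketch below shows the same peeling idea on a plain edge list. It is not the cmph implementation: instead of the first/next linked lists and the deleted bitmap, it keeps a degree counter and an XOR of incident edge ids per vertex, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Edge e joins a[e] and b[e]. Returns 1 if the undirected graph contains a
 * cycle, 0 if it is a forest, -1 on allocation failure. */
static int demo_is_cyclic(uint32_t nnodes, uint32_t nedges,
                          const uint32_t *a, const uint32_t *b)
{
    size_t n = nnodes ? nnodes : 1;
    uint32_t *degree = calloc(n, sizeof *degree);
    uint32_t *xor_edge = calloc(n, sizeof *xor_edge);
    uint32_t *stack = malloc(n * sizeof *stack);
    uint32_t top = 0, removed = 0, v, e;

    if (!degree || !xor_edge || !stack) {
        free(degree); free(xor_edge); free(stack);
        return -1;
    }
    for (e = 0; e < nedges; e++) {
        degree[a[e]]++; xor_edge[a[e]] ^= e;
        degree[b[e]]++; xor_edge[b[e]] ^= e;
    }
    /* Seed the work stack with every leaf (degree-1 vertex). */
    for (v = 0; v < nnodes; v++)
        if (degree[v] == 1) stack[top++] = v;

    while (top > 0) {
        uint32_t other;
        v = stack[--top];
        if (degree[v] != 1) continue;  /* its last edge was already peeled */
        e = xor_edge[v];               /* the single remaining incident edge */
        other = (a[e] == v) ? b[e] : a[e];
        degree[v] = 0;
        degree[other]--;
        xor_edge[other] ^= e;
        removed++;
        if (degree[other] == 1) stack[top++] = other;
    }
    free(degree); free(xor_edge); free(stack);
    return removed != nedges;  /* leftover edges lie on a cycle */
}
```

A path 0-1-2 peels away completely and reports 0, while a triangle has no degree-1 vertex to start from and reports 1.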

girepository/cmph/graph.h Normal file

@ -0,0 +1,40 @@
#ifndef _CMPH_GRAPH_H__
#define _CMPH_GRAPH_H__
#include <limits.h>
#include "cmph_types.h"
#define GRAPH_NO_NEIGHBOR UINT_MAX
typedef struct __graph_t graph_t;
typedef struct __graph_iterator_t graph_iterator_t;
struct __graph_iterator_t
{
cmph_uint32 vertex;
cmph_uint32 edge;
};
graph_t *graph_new(cmph_uint32 nnodes, cmph_uint32 nedges);
void graph_destroy(graph_t *graph);
void graph_add_edge(graph_t *g, cmph_uint32 v1, cmph_uint32 v2);
void graph_del_edge(graph_t *g, cmph_uint32 v1, cmph_uint32 v2);
void graph_clear_edges(graph_t *g);
cmph_uint32 graph_edge_id(graph_t *g, cmph_uint32 v1, cmph_uint32 v2);
cmph_uint8 graph_contains_edge(graph_t *g, cmph_uint32 v1, cmph_uint32 v2);
graph_iterator_t graph_neighbors_it(graph_t *g, cmph_uint32 v);
cmph_uint32 graph_next_neighbor(graph_t *g, graph_iterator_t* it);
void graph_obtain_critical_nodes(graph_t *g); /* included -- Fabiano*/
cmph_uint8 graph_node_is_critical(graph_t * g, cmph_uint32 v); /* included -- Fabiano */
cmph_uint32 graph_ncritical_nodes(graph_t *g); /* included -- Fabiano*/
cmph_uint32 graph_vertex_id(graph_t *g, cmph_uint32 e, cmph_uint32 id); /* included -- Fabiano*/
int graph_is_cyclic(graph_t *g);
void graph_print(graph_t *);
#endif

girepository/cmph/hash.c Normal file

@ -0,0 +1,216 @@
#include "hash_state.h"
#include <stdlib.h>
#include <assert.h>
#include <limits.h>
#include <string.h>
//#define DEBUG
#include "debug.h"
const char *cmph_hash_names[] = { "jenkins", NULL };
hash_state_t *hash_state_new(CMPH_HASH hashfunc, cmph_uint32 hashsize)
{
hash_state_t *state = NULL;
switch (hashfunc)
{
case CMPH_HASH_JENKINS:
DEBUGP("Jenkins function - %u\n", hashsize);
state = (hash_state_t *)jenkins_state_new(hashsize);
DEBUGP("Jenkins function created\n");
break;
default:
assert(0);
}
state->hashfunc = hashfunc;
return state;
}
cmph_uint32 hash(hash_state_t *state, const char *key, cmph_uint32 keylen)
{
switch (state->hashfunc)
{
case CMPH_HASH_JENKINS:
return jenkins_hash((jenkins_state_t *)state, key, keylen);
default:
assert(0);
}
assert(0);
return 0;
}
void hash_vector(hash_state_t *state, const char *key, cmph_uint32 keylen, cmph_uint32 * hashes)
{
switch (state->hashfunc)
{
case CMPH_HASH_JENKINS:
jenkins_hash_vector_((jenkins_state_t *)state, key, keylen, hashes);
break;
default:
assert(0);
}
}
void hash_state_dump(hash_state_t *state, char **buf, cmph_uint32 *buflen)
{
char *algobuf;
size_t len;
switch (state->hashfunc)
{
case CMPH_HASH_JENKINS:
jenkins_state_dump((jenkins_state_t *)state, &algobuf, buflen);
if (*buflen == UINT_MAX) return;
break;
default:
assert(0);
}
*buf = (char *)malloc(strlen(cmph_hash_names[state->hashfunc]) + 1 + *buflen);
memcpy(*buf, cmph_hash_names[state->hashfunc], strlen(cmph_hash_names[state->hashfunc]) + 1);
DEBUGP("Algobuf is %u\n", *(cmph_uint32 *)algobuf);
len = *buflen;
memcpy(*buf + strlen(cmph_hash_names[state->hashfunc]) + 1, algobuf, len);
*buflen = (cmph_uint32)strlen(cmph_hash_names[state->hashfunc]) + 1 + *buflen;
free(algobuf);
return;
}
hash_state_t * hash_state_copy(hash_state_t *src_state)
{
hash_state_t *dest_state = NULL;
switch (src_state->hashfunc)
{
case CMPH_HASH_JENKINS:
dest_state = (hash_state_t *)jenkins_state_copy((jenkins_state_t *)src_state);
break;
default:
assert(0);
}
dest_state->hashfunc = src_state->hashfunc;
return dest_state;
}
hash_state_t *hash_state_load(const char *buf, cmph_uint32 buflen)
{
cmph_uint32 i;
cmph_uint32 offset;
CMPH_HASH hashfunc = CMPH_HASH_COUNT;
for (i = 0; i < CMPH_HASH_COUNT; ++i)
{
if (strcmp(buf, cmph_hash_names[i]) == 0)
{
hashfunc = i;
break;
}
}
if (hashfunc == CMPH_HASH_COUNT) return NULL;
offset = (cmph_uint32)strlen(cmph_hash_names[hashfunc]) + 1;
switch (hashfunc)
{
case CMPH_HASH_JENKINS:
return (hash_state_t *)jenkins_state_load(buf + offset, buflen - offset);
default:
return NULL;
}
return NULL;
}
void hash_state_destroy(hash_state_t *state)
{
switch (state->hashfunc)
{
case CMPH_HASH_JENKINS:
jenkins_state_destroy((jenkins_state_t *)state);
break;
default:
assert(0);
}
return;
}
/** \fn void hash_state_pack(hash_state_t *state, void *hash_packed)
* \brief Support the ability to pack a hash function into a preallocated contiguous memory space pointed by hash_packed.
* \param state points to the hash function
* \param hash_packed pointer to the contiguous memory area used to store the hash function. The size of hash_packed must be at least hash_state_packed_size()
*
* Support the ability to pack a hash function into a preallocated contiguous memory space pointed by hash_packed.
* However, the hash function type must be packed outside.
*/
void hash_state_pack(hash_state_t *state, void *hash_packed)
{
switch (state->hashfunc)
{
case CMPH_HASH_JENKINS:
// pack the jenkins hash function
jenkins_state_pack((jenkins_state_t *)state, hash_packed);
break;
default:
assert(0);
}
return;
}
/** \fn cmph_uint32 hash_state_packed_size(CMPH_HASH hashfunc)
* \brief Return the amount of space needed to pack a hash function.
* \param hashfunc function type
* \return the size of the packed function or zero for failures
*/
cmph_uint32 hash_state_packed_size(CMPH_HASH hashfunc)
{
cmph_uint32 size = 0;
switch (hashfunc)
{
case CMPH_HASH_JENKINS:
size += jenkins_state_packed_size();
break;
default:
assert(0);
}
return size;
}
/** \fn cmph_uint32 hash_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen)
* \param hash_packed is a pointer to a contiguous memory area
* \param hashfunc is the type of the hash function packed in hash_packed
* \param k is a pointer to a key
* \param keylen is the key length
* \return an integer that represents a hash value of 32 bits.
*/
cmph_uint32 hash_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen)
{
switch (hashfunc)
{
case CMPH_HASH_JENKINS:
return jenkins_hash_packed(hash_packed, k, keylen);
default:
assert(0);
}
assert(0);
return 0;
}
/** \fn void hash_vector_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes)
* \param hash_packed is a pointer to a contiguous memory area
* \param hashfunc is the type of the hash function packed in hash_packed
* \param k is a pointer to a key
* \param keylen is the key length
* \param hashes is a pointer to a memory large enough to fit three 32-bit integers.
*/
void hash_vector_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes)
{
switch (hashfunc)
{
case CMPH_HASH_JENKINS:
jenkins_hash_vector_packed(hash_packed, k, keylen, hashes);
break;
default:
assert(0);
}
}
/** \fn CMPH_HASH hash_get_type(hash_state_t *state);
* \param state is a pointer to a hash_state_t structure
* \return the hash function type pointed by state
*/
CMPH_HASH hash_get_type(hash_state_t *state)
{
return state->hashfunc;
}
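hash_state_dump and hash_state_load agree on a simple buffer format: the NUL-terminated hash name from cmph_hash_names, followed by the algorithm-specific bytes. The sketch below shows that framing in isolation. The demo_* names and the opaque payload are hypothetical, and the loader compares with a length-bounded memcmp rather than the strcmp used in the listing, which is a choice made here so that a short buffer is never read past its end.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name table, mirroring the shape of cmph_hash_names. */
static const char *demo_names[] = { "jenkins", NULL };

/* Builds "<name>\0<payload>" in a fresh buffer; the caller frees it. */
static char *demo_dump(unsigned kind, const void *payload,
                       uint32_t payload_len, uint32_t *buflen)
{
    size_t name_len = strlen(demo_names[kind]) + 1;  /* include the NUL */
    char *buf = malloc(name_len + payload_len);
    if (!buf) return NULL;
    memcpy(buf, demo_names[kind], name_len);
    memcpy(buf + name_len, payload, payload_len);
    *buflen = (uint32_t)(name_len + payload_len);
    return buf;
}

/* Finds which name prefixes buf. On success returns its index and points
 * *payload at the bytes after the NUL; returns -1 for an unknown name. */
static int demo_load(const char *buf, uint32_t buflen,
                     const char **payload, uint32_t *payload_len)
{
    int i;
    for (i = 0; demo_names[i] != NULL; i++) {
        size_t name_len = strlen(demo_names[i]) + 1;
        if (buflen >= name_len && memcmp(buf, demo_names[i], name_len) == 0) {
            *payload = buf + name_len;
            *payload_len = buflen - (uint32_t)name_len;
            return i;
        }
    }
    return -1;
}
```

Prefixing the name instead of the numeric enum value keeps a dumped state readable even if the order of the hash enum changes, at the cost of a few extra bytes per state.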

girepository/cmph/hash.h Normal file

@ -0,0 +1,76 @@
#ifndef __CMPH_HASH_H__
#define __CMPH_HASH_H__
#include "cmph_types.h"
typedef union __hash_state_t hash_state_t;
hash_state_t *hash_state_new(CMPH_HASH, cmph_uint32 hashsize);
/** \fn cmph_uint32 hash(hash_state_t *state, const char *key, cmph_uint32 keylen);
* \param state is a pointer to a hash_state_t structure
* \param key is a pointer to a key
* \param keylen is the key length
* \return an integer that represents a hash value of 32 bits.
*/
cmph_uint32 hash(hash_state_t *state, const char *key, cmph_uint32 keylen);
/** \fn void hash_vector(hash_state_t *state, const char *key, cmph_uint32 keylen, cmph_uint32 * hashes);
* \param state is a pointer to a hash_state_t structure
* \param key is a pointer to a key
* \param keylen is the key length
* \param hashes is a pointer to a memory large enough to fit three 32-bit integers.
*/
void hash_vector(hash_state_t *state, const char *key, cmph_uint32 keylen, cmph_uint32 * hashes);
void hash_state_dump(hash_state_t *state, char **buf, cmph_uint32 *buflen);
hash_state_t * hash_state_copy(hash_state_t *src_state);
hash_state_t *hash_state_load(const char *buf, cmph_uint32 buflen);
void hash_state_destroy(hash_state_t *state);
/** \fn void hash_state_pack(hash_state_t *state, void *hash_packed);
* \brief Support the ability to pack a hash function into a preallocated contiguous memory space pointed by hash_packed.
* \param state points to the hash function
* \param hash_packed pointer to the contiguous memory area used to store the hash function. The size of hash_packed must be at least hash_state_packed_size()
*
* Support the ability to pack a hash function into a preallocated contiguous memory space pointed by hash_packed.
* However, the hash function type must be packed outside.
*/
void hash_state_pack(hash_state_t *state, void *hash_packed);
/** \fn cmph_uint32 hash_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen);
* \param hash_packed is a pointer to a contiguous memory area
* \param hashfunc is the type of the hash function packed in hash_packed
* \param k is a pointer to a key
* \param keylen is the key length
* \return an integer that represents a hash value of 32 bits.
*/
cmph_uint32 hash_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen);
/** \fn cmph_uint32 hash_state_packed_size(CMPH_HASH hashfunc)
* \brief Return the amount of space needed to pack a hash function.
* \param hashfunc function type
* \return the size of the packed function or zero for failures
*/
cmph_uint32 hash_state_packed_size(CMPH_HASH hashfunc);
/** \fn void hash_vector_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes);
* \param hash_packed is a pointer to a contiguous memory area
* \param hashfunc is the type of the hash function packed in hash_packed
* \param k is a pointer to a key
* \param keylen is the key length
* \param hashes is a pointer to memory large enough to fit three 32-bit integers.
*/
void hash_vector_packed(void *hash_packed, CMPH_HASH hashfunc, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes);
/** \fn CMPH_HASH hash_get_type(hash_state_t *state);
* \param state is a pointer to a hash_state_t structure
* \return the type of the hash function that state points to
*/
CMPH_HASH hash_get_type(hash_state_t *state);
#endif

@ -0,0 +1,12 @@
#ifndef __HASH_STATE_H__
#define __HASH_STATE_H__
#include "hash.h"
#include "jenkins_hash.h"
union __hash_state_t
{
CMPH_HASH hashfunc;
jenkins_state_t jenkins;
};
#endif

@ -0,0 +1,289 @@
#include "graph.h"
#include "hashtree.h"
#include "cmph_structs.h"
#include "hashtree_structs.h"
#include "hash.h"
#include "bitbool.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
//#define DEBUG
#include "debug.h"
hashtree_config_data_t *hashtree_config_new()
{
hashtree_config_data_t *hashtree;
hashtree = (hashtree_config_data_t *)malloc(sizeof(hashtree_config_data_t));
if (!hashtree) return NULL;
memset(hashtree, 0, sizeof(hashtree_config_data_t));
hashtree->hashfuncs[0] = CMPH_HASH_JENKINS;
hashtree->hashfuncs[1] = CMPH_HASH_JENKINS;
hashtree->hashfuncs[2] = CMPH_HASH_JENKINS;
hashtree->memory = 32 * 1024 * 1024;
return hashtree;
}
void hashtree_config_destroy(cmph_config_t *mph)
{
hashtree_config_data_t *data = (hashtree_config_data_t *)mph->data;
DEBUGP("Destroying algorithm dependent data\n");
free(data);
}
void hashtree_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs)
{
hashtree_config_data_t *hashtree = (hashtree_config_data_t *)mph->data;
CMPH_HASH *hashptr = hashfuncs;
cmph_uint32 i = 0;
while(*hashptr != CMPH_HASH_COUNT)
{
if (i >= 3) break; //hashtree only uses three hash functions
hashtree->hashfuncs[i] = *hashptr;
++i, ++hashptr;
}
}
cmph_t *hashtree_new(cmph_config_t *mph, double c)
{
cmph_t *mphf = NULL;
hashtree_data_t *hashtreef = NULL;
cmph_uint32 i;
cmph_uint32 iterations = 20;
cmph_uint8 *visited = NULL;
hashtree_config_data_t *hashtree = (hashtree_config_data_t *)mph->data;
hashtree->m = mph->key_source->nkeys;
hashtree->n = ceil(c * mph->key_source->nkeys);
DEBUGP("m (edges): %u n (vertices): %u c: %f\n", hashtree->m, hashtree->n, c);
hashtree->graph = graph_new(hashtree->n, hashtree->m);
DEBUGP("Created graph\n");
hashtree->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*3);
for(i = 0; i < 3; ++i) hashtree->hashes[i] = NULL;
//Mapping step
if (mph->verbosity)
{
fprintf(stderr, "Entering mapping step for mph creation of %u keys with graph sized %u\n", hashtree->m, hashtree->n);
}
while(1)
{
int ok;
hashtree->hashes[0] = hash_state_new(hashtree->hashfuncs[0], hashtree->n);
hashtree->hashes[1] = hash_state_new(hashtree->hashfuncs[1], hashtree->n);
ok = hashtree_gen_edges(mph);
if (!ok)
{
--iterations;
hash_state_destroy(hashtree->hashes[0]);
hashtree->hashes[0] = NULL;
hash_state_destroy(hashtree->hashes[1]);
hashtree->hashes[1] = NULL;
DEBUGP("%u iterations remaining\n", iterations);
if (mph->verbosity)
{
fprintf(stderr, "Acyclic graph creation failure - %u iterations remaining\n", iterations);
}
if (iterations == 0) break;
}
else break;
}
if (iterations == 0)
{
graph_destroy(hashtree->graph);
return NULL;
}
//Assignment step
if (mph->verbosity)
{
fprintf(stderr, "Starting assignment step\n");
}
DEBUGP("Assignment step\n");
visited = (cmph_uint8 *)malloc(hashtree->n/8 + 1);
memset(visited, 0, hashtree->n/8 + 1);
free(hashtree->g);
hashtree->g = (cmph_uint32 *)malloc(hashtree->n * sizeof(cmph_uint32));
assert(hashtree->g);
for (i = 0; i < hashtree->n; ++i)
{
if (!GETBIT(visited,i))
{
hashtree->g[i] = 0;
hashtree_traverse(hashtree, visited, i);
}
}
graph_destroy(hashtree->graph);
free(visited);
hashtree->graph = NULL;
mphf = (cmph_t *)malloc(sizeof(cmph_t));
mphf->algo = mph->algo;
hashtreef = (hashtree_data_t *)malloc(sizeof(hashtree_data_t));
hashtreef->g = hashtree->g;
hashtree->g = NULL; //transfer memory ownership
hashtreef->hashes = hashtree->hashes;
hashtree->hashes = NULL; //transfer memory ownership
hashtreef->n = hashtree->n;
hashtreef->m = hashtree->m;
mphf->data = hashtreef;
mphf->size = hashtree->m;
DEBUGP("Successfully generated minimal perfect hash\n");
if (mph->verbosity)
{
fprintf(stderr, "Successfully generated minimal perfect hash function\n");
}
return mphf;
}
static void hashtree_traverse(hashtree_config_data_t *hashtree, cmph_uint8 *visited, cmph_uint32 v)
{
graph_iterator_t it = graph_neighbors_it(hashtree->graph, v);
cmph_uint32 neighbor = 0;
SETBIT(visited,v);
DEBUGP("Visiting vertex %u\n", v);
while((neighbor = graph_next_neighbor(hashtree->graph, &it)) != GRAPH_NO_NEIGHBOR)
{
DEBUGP("Visiting neighbor %u\n", neighbor);
if(GETBIT(visited,neighbor)) continue;
DEBUGP("Visiting neighbor %u\n", neighbor);
DEBUGP("Visiting edge %u->%u with id %u\n", v, neighbor, graph_edge_id(hashtree->graph, v, neighbor));
hashtree->g[neighbor] = graph_edge_id(hashtree->graph, v, neighbor) - hashtree->g[v];
DEBUGP("g is %u (%u - %u mod %u)\n", hashtree->g[neighbor], graph_edge_id(hashtree->graph, v, neighbor), hashtree->g[v], hashtree->m);
hashtree_traverse(hashtree, visited, neighbor);
}
}
static int hashtree_gen_edges(cmph_config_t *mph)
{
cmph_uint32 e;
hashtree_config_data_t *hashtree = (hashtree_config_data_t *)mph->data;
int cycles = 0;
DEBUGP("Generating edges for %u vertices with hash functions %s and %s\n", hashtree->n, cmph_hash_names[hashtree->hashfuncs[0]], cmph_hash_names[hashtree->hashfuncs[1]]);
graph_clear_edges(hashtree->graph);
mph->key_source->rewind(mph->key_source->data);
for (e = 0; e < mph->key_source->nkeys; ++e)
{
cmph_uint32 h1, h2;
cmph_uint32 keylen;
char *key;
mph->key_source->read(mph->key_source->data, &key, &keylen);
h1 = hash(hashtree->hashes[0], key, keylen) % hashtree->n;
h2 = hash(hashtree->hashes[1], key, keylen) % hashtree->n;
if (h1 == h2) if (++h2 >= hashtree->n) h2 = 0;
if (h1 == h2)
{
if (mph->verbosity) fprintf(stderr, "Self loop for key %u\n", e);
mph->key_source->dispose(mph->key_source->data, key, keylen);
return 0;
}
DEBUGP("Adding edge: %u -> %u for key %s\n", h1, h2, key);
mph->key_source->dispose(mph->key_source->data, key, keylen);
graph_add_edge(hashtree->graph, h1, h2);
}
cycles = graph_is_cyclic(hashtree->graph);
if (mph->verbosity && cycles) fprintf(stderr, "Cyclic graph generated\n");
DEBUGP("Looking for cycles: %u\n", cycles);
return ! cycles;
}
int hashtree_dump(cmph_t *mphf, FILE *fd)
{
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 two = 2; //number of hash functions
hashtree_data_t *data = (hashtree_data_t *)mphf->data;
__cmph_dump(mphf, fd);
fwrite(&two, sizeof(cmph_uint32), 1, fd);
hash_state_dump(data->hashes[0], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
fwrite(&buflen, sizeof(cmph_uint32), 1, fd);
fwrite(buf, buflen, 1, fd);
free(buf);
hash_state_dump(data->hashes[1], &buf, &buflen);
DEBUGP("Dumping hash state with %u bytes to disk\n", buflen);
fwrite(&buflen, sizeof(cmph_uint32), 1, fd);
fwrite(buf, buflen, 1, fd);
free(buf);
fwrite(&(data->n), sizeof(cmph_uint32), 1, fd);
fwrite(&(data->m), sizeof(cmph_uint32), 1, fd);
fwrite(data->g, sizeof(cmph_uint32)*data->n, 1, fd);
#ifdef DEBUG
{
cmph_uint32 i; //only needed for the debug dump of g
fprintf(stderr, "G: ");
for (i = 0; i < data->n; ++i) fprintf(stderr, "%u ", data->g[i]);
fprintf(stderr, "\n");
}
#endif
return 1;
}
void hashtree_load(FILE *f, cmph_t *mphf)
{
cmph_uint32 nhashes;
char *buf = NULL;
cmph_uint32 buflen;
cmph_uint32 i;
hashtree_data_t *hashtree = (hashtree_data_t *)malloc(sizeof(hashtree_data_t));
DEBUGP("Loading hashtree mphf\n");
mphf->data = hashtree;
fread(&nhashes, sizeof(cmph_uint32), 1, f);
hashtree->hashes = (hash_state_t **)malloc(sizeof(hash_state_t *)*(nhashes + 1));
hashtree->hashes[nhashes] = NULL;
DEBUGP("Reading %u hashes\n", nhashes);
for (i = 0; i < nhashes; ++i)
{
hash_state_t *state = NULL;
fread(&buflen, sizeof(cmph_uint32), 1, f);
DEBUGP("Hash state has %u bytes\n", buflen);
buf = (char *)malloc(buflen);
fread(buf, buflen, 1, f);
state = hash_state_load(buf, buflen);
hashtree->hashes[i] = state;
free(buf);
}
DEBUGP("Reading m and n\n");
fread(&(hashtree->n), sizeof(cmph_uint32), 1, f);
fread(&(hashtree->m), sizeof(cmph_uint32), 1, f);
hashtree->g = (cmph_uint32 *)malloc(sizeof(cmph_uint32)*hashtree->n);
fread(hashtree->g, hashtree->n*sizeof(cmph_uint32), 1, f);
#ifdef DEBUG
fprintf(stderr, "G: ");
for (i = 0; i < hashtree->n; ++i) fprintf(stderr, "%u ", hashtree->g[i]);
fprintf(stderr, "\n");
#endif
return;
}
cmph_uint32 hashtree_search(cmph_t *mphf, const char *key, cmph_uint32 keylen)
{
hashtree_data_t *hashtree = mphf->data;
cmph_uint32 h1 = hash(hashtree->hashes[0], key, keylen) % hashtree->n;
cmph_uint32 h2 = hash(hashtree->hashes[1], key, keylen) % hashtree->n;
DEBUGP("key: %s h1: %u h2: %u\n", key, h1, h2);
if (h1 == h2 && ++h2 >= hashtree->n) h2 = 0;
DEBUGP("key: %s g[h1]: %u g[h2]: %u edges: %u\n", key, hashtree->g[h1], hashtree->g[h2], hashtree->m);
return (hashtree->g[h1] + hashtree->g[h2]) % hashtree->m;
}
void hashtree_destroy(cmph_t *mphf)
{
hashtree_data_t *data = (hashtree_data_t *)mphf->data;
free(data->g);
hash_state_destroy(data->hashes[0]);
hash_state_destroy(data->hashes[1]);
free(data->hashes);
free(data);
free(mphf);
}
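
The search step in hashtree_search() relies on a table g built so that g[h1] + g[h2] recovers each key's edge id. Below is a minimal, self-contained sketch of that assignment on a toy acyclic graph. The edge list, the array sizes, and the names toy_edges, toy_assign, toy_build and toy_lookup are hypothetical and not part of cmph. The sketch scans a flat edge list rather than using the graph.h iterator API.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VERTICES 6
#define TOY_EDGES 4

/* Hypothetical input: edge i joins the two vertices that key i hashes to.
 * The graph must be acyclic for the assignment below to be consistent. */
static const uint32_t toy_edges[TOY_EDGES][2] = {
    {2, 3}, {0, 1}, {1, 2}, {4, 5}
};

static uint32_t toy_g[TOY_VERTICES];
static int toy_visited[TOY_VERTICES];

/* Depth-first traversal: for each unvisited neighbour u of v reached via
 * edge e, choose g[u] so that g[v] + g[u] == e (modulo 2^32). */
static void toy_assign(uint32_t v)
{
    uint32_t e;

    toy_visited[v] = 1;
    for (e = 0; e < TOY_EDGES; ++e) {
        uint32_t u;

        if (toy_edges[e][0] == v) u = toy_edges[e][1];
        else if (toy_edges[e][1] == v) u = toy_edges[e][0];
        else continue;
        if (toy_visited[u]) continue;
        toy_g[u] = e - toy_g[v]; /* unsigned wraparound is intentional */
        toy_assign(u);
    }
}

/* Start one traversal per connected component, rooting each at g = 0. */
static void toy_build(void)
{
    uint32_t v;

    for (v = 0; v < TOY_VERTICES; ++v) {
        if (!toy_visited[v]) {
            toy_g[v] = 0;
            toy_assign(v);
        }
    }
}

/* Same shape as the lookup in hashtree_search(): sum, then reduce mod m. */
static uint32_t toy_lookup(uint32_t h1, uint32_t h2)
{
    return (toy_g[h1] + toy_g[h2]) % TOY_EDGES;
}

/* Every edge must map back to its own index, i.e. the hash is minimal
 * and perfect over the toy key set. */
static void toy_verify(void)
{
    uint32_t e;

    for (e = 0; e < TOY_EDGES; ++e)
        assert(toy_lookup(toy_edges[e][0], toy_edges[e][1]) == e);
}
```

Calling toy_build() and then toy_verify() checks that each of the four toy keys lands in its own slot. Because an edge id is always smaller than the edge count, the wrapped 32-bit sum reduces to the id itself after the final modulo.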

@ -0,0 +1,19 @@
#ifndef __CMPH_HASHTREE_H__
#define __CMPH_HASHTREE_H__
#include "cmph.h"
typedef struct __hashtree_data_t hashtree_data_t;
typedef struct __hashtree_config_data_t hashtree_config_data_t;
hashtree_config_data_t *hashtree_config_new();
void hashtree_config_set_hashfuncs(cmph_config_t *mph, CMPH_HASH *hashfuncs);
void hashtree_config_set_leaf_algo(cmph_config_t *mph, CMPH_ALGO leaf_algo);
void hashtree_config_destroy(cmph_config_t *mph);
cmph_t *hashtree_new(cmph_config_t *mph, double c);
void hashtree_load(FILE *f, cmph_t *mphf);
int hashtree_dump(cmph_t *mphf, FILE *f);
void hashtree_destroy(cmph_t *mphf);
cmph_uint32 hashtree_search(cmph_t *mphf, const char *key, cmph_uint32 keylen);
#endif

@ -0,0 +1,32 @@
#ifndef __CMPH_HASHTREE_STRUCTS_H__
#define __CMPH_HASHTREE_STRUCTS_H__
#include "hash_state.h"
struct __hashtree_data_t
{
cmph_uint32 m; //edges (words) count
double c; //constant c
cmph_uint8 *size; //size[i] stores the number of edges represented by g[i]
cmph_uint32 **g;
cmph_uint32 k; //number of components
hash_state_t **h1;
hash_state_t **h2;
hash_state_t *h3;
};
struct __hashtree_config_data_t
{
CMPH_ALGO leaf_algo;
CMPH_HASH hashfuncs[3];
cmph_uint32 m; //edges (words) count
cmph_uint8 *size; //size[i] stores the number of edges represented by g[i]
cmph_uint32 *offset; //offset[i] stores the sum size[0] + ... size[i - 1]
cmph_uint32 k; //number of components
cmph_uint32 memory;
hash_state_t **h1;
hash_state_t **h2;
hash_state_t *h3;
};
#endif

@ -0,0 +1,297 @@
#include "jenkins_hash.h"
#include <stdlib.h>
#ifdef WIN32
#define _USE_MATH_DEFINES //For M_LOG2E
#endif
#include <math.h>
#include <limits.h>
#include <string.h>
//#define DEBUG
#include "debug.h"
#define hashsize(n) ((cmph_uint32)1<<(n))
#define hashmask(n) (hashsize(n)-1)
//#define NM2 /* Define this if you do not want power of 2 table sizes*/
/*
--------------------------------------------------------------------
mix -- mix 3 32-bit values reversibly.
For every delta with one or two bits set, and the deltas of all three
high bits or all three low bits, whether the original value of a,b,c
is almost all zero or is uniformly distributed,
* If mix() is run forward or backward, at least 32 bits in a,b,c
have at least 1/4 probability of changing.
* If mix() is run forward, every bit of c will change between 1/3 and
2/3 of the time. (Well, 22/100 and 78/100 for some 2-bit deltas.)
mix() was built out of 36 single-cycle latency instructions in a
structure that could support 2x parallelism, like so:
a -= b;
a -= c; x = (c>>13);
b -= c; a ^= x;
b -= a; x = (a<<8);
c -= a; b ^= x;
c -= b; x = (b>>13);
...
Unfortunately, superscalar Pentiums and Sparcs can't take advantage
of that parallelism. They've also turned some of those single-cycle
latency instructions into multi-cycle latency instructions. Still,
this is the fastest good hash I could find. There were about 2^^68
to choose from. I only looked at a billion or so.
--------------------------------------------------------------------
*/
#define mix(a,b,c) \
{ \
a -= b; a -= c; a ^= (c>>13); \
b -= c; b -= a; b ^= (a<<8); \
c -= a; c -= b; c ^= (b>>13); \
a -= b; a -= c; a ^= (c>>12); \
b -= c; b -= a; b ^= (a<<16); \
c -= a; c -= b; c ^= (b>>5); \
a -= b; a -= c; a ^= (c>>3); \
b -= c; b -= a; b ^= (a<<10); \
c -= a; c -= b; c ^= (b>>15); \
}
/*
--------------------------------------------------------------------
hash() -- hash a variable-length key into a 32-bit value
k : the key (the unaligned variable-length array of bytes)
len : the length of the key, counting by bytes
initval : can be any 4-byte value
Returns a 32-bit value. Every bit of the key affects every bit of
the return value. Every 1-bit and 2-bit delta achieves avalanche.
About 6*len+35 instructions.
The best hash table sizes are powers of 2. There is no need to do
mod a prime (mod is sooo slow!). If you need less than 32 bits,
use a bitmask. For example, if you need only 10 bits, do
h = (h & hashmask(10));
In which case, the hash table should have hashsize(10) elements.
If you are hashing n strings (cmph_uint8 **)k, do it like this:
for (i=0, h=0; i<n; ++i) h = hash( k[i], len[i], h);
By Bob Jenkins, 1996. bob_jenkins@burtleburtle.net. You may use this
code any way you wish, private, educational, or commercial. It's free.
See http://burtleburtle.net/bob/hash/evahash.html
Use for hash table lookup, or anything where one collision in 2^^32 is
acceptable. Do NOT use for cryptographic purposes.
--------------------------------------------------------------------
*/
jenkins_state_t *jenkins_state_new(cmph_uint32 size) //size of hash table
{
jenkins_state_t *state = (jenkins_state_t *)malloc(sizeof(jenkins_state_t));
DEBUGP("Initializing jenkins hash\n");
state->seed = ((cmph_uint32)rand() % size);
return state;
}
void jenkins_state_destroy(jenkins_state_t *state)
{
free(state);
}
static inline void __jenkins_hash_vector(cmph_uint32 seed, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes)
{
register cmph_uint32 len, length;
/* Set up the internal state */
length = keylen;
len = length;
hashes[0] = hashes[1] = 0x9e3779b9; /* the golden ratio; an arbitrary value */
hashes[2] = seed; /* the previous hash value - seed in our case */
/*---------------------------------------- handle most of the key */
while (len >= 12)
{
hashes[0] += ((cmph_uint32)k[0] +((cmph_uint32)k[1]<<8) +((cmph_uint32)k[2]<<16) +((cmph_uint32)k[3]<<24));
hashes[1] += ((cmph_uint32)k[4] +((cmph_uint32)k[5]<<8) +((cmph_uint32)k[6]<<16) +((cmph_uint32)k[7]<<24));
hashes[2] += ((cmph_uint32)k[8] +((cmph_uint32)k[9]<<8) +((cmph_uint32)k[10]<<16)+((cmph_uint32)k[11]<<24));
mix(hashes[0],hashes[1],hashes[2]);
k += 12; len -= 12;
}
/*------------------------------------- handle the last 11 bytes */
hashes[2] += length;
switch(len) /* all the case statements fall through */
{
case 11:
hashes[2] +=((cmph_uint32)k[10]<<24);
case 10:
hashes[2] +=((cmph_uint32)k[9]<<16);
case 9 :
hashes[2] +=((cmph_uint32)k[8]<<8);
/* the first byte of hashes[2] is reserved for the length */
case 8 :
hashes[1] +=((cmph_uint32)k[7]<<24);
case 7 :
hashes[1] +=((cmph_uint32)k[6]<<16);
case 6 :
hashes[1] +=((cmph_uint32)k[5]<<8);
case 5 :
hashes[1] +=(cmph_uint8) k[4];
case 4 :
hashes[0] +=((cmph_uint32)k[3]<<24);
case 3 :
hashes[0] +=((cmph_uint32)k[2]<<16);
case 2 :
hashes[0] +=((cmph_uint32)k[1]<<8);
case 1 :
hashes[0] +=(cmph_uint8)k[0];
/* case 0: nothing left to add */
}
mix(hashes[0],hashes[1],hashes[2]);
}
cmph_uint32 jenkins_hash(jenkins_state_t *state, const char *k, cmph_uint32 keylen)
{
cmph_uint32 hashes[3];
__jenkins_hash_vector(state->seed, k, keylen, hashes);
return hashes[2];
/* cmph_uint32 a, b, c;
cmph_uint32 len, length;
// Set up the internal state
length = keylen;
len = length;
a = b = 0x9e3779b9; // the golden ratio; an arbitrary value
c = state->seed; // the previous hash value - seed in our case
// handle most of the key
while (len >= 12)
{
a += (k[0] +((cmph_uint32)k[1]<<8) +((cmph_uint32)k[2]<<16) +((cmph_uint32)k[3]<<24));
b += (k[4] +((cmph_uint32)k[5]<<8) +((cmph_uint32)k[6]<<16) +((cmph_uint32)k[7]<<24));
c += (k[8] +((cmph_uint32)k[9]<<8) +((cmph_uint32)k[10]<<16)+((cmph_uint32)k[11]<<24));
mix(a,b,c);
k += 12; len -= 12;
}
// handle the last 11 bytes
c += length;
switch(len) /// all the case statements fall through
{
case 11:
c +=((cmph_uint32)k[10]<<24);
case 10:
c +=((cmph_uint32)k[9]<<16);
case 9 :
c +=((cmph_uint32)k[8]<<8);
// the first byte of c is reserved for the length
case 8 :
b +=((cmph_uint32)k[7]<<24);
case 7 :
b +=((cmph_uint32)k[6]<<16);
case 6 :
b +=((cmph_uint32)k[5]<<8);
case 5 :
b +=k[4];
case 4 :
a +=((cmph_uint32)k[3]<<24);
case 3 :
a +=((cmph_uint32)k[2]<<16);
case 2 :
a +=((cmph_uint32)k[1]<<8);
case 1 :
a +=k[0];
// case 0: nothing left to add
}
mix(a,b,c);
/// report the result
return c;
*/
}
void jenkins_hash_vector_(jenkins_state_t *state, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes)
{
__jenkins_hash_vector(state->seed, k, keylen, hashes);
}
void jenkins_state_dump(jenkins_state_t *state, char **buf, cmph_uint32 *buflen)
{
*buflen = sizeof(cmph_uint32);
*buf = (char *)malloc(sizeof(cmph_uint32));
if (!*buf)
{
*buflen = UINT_MAX;
return;
}
memcpy(*buf, &(state->seed), sizeof(cmph_uint32));
DEBUGP("Dumped jenkins state with seed %u\n", state->seed);
return;
}
jenkins_state_t *jenkins_state_copy(jenkins_state_t *src_state)
{
jenkins_state_t *dest_state = (jenkins_state_t *)malloc(sizeof(jenkins_state_t));
dest_state->hashfunc = src_state->hashfunc;
dest_state->seed = src_state->seed;
return dest_state;
}
jenkins_state_t *jenkins_state_load(const char *buf, cmph_uint32 buflen)
{
jenkins_state_t *state = (jenkins_state_t *)malloc(sizeof(jenkins_state_t));
state->seed = *(cmph_uint32 *)buf;
state->hashfunc = CMPH_HASH_JENKINS;
DEBUGP("Loaded jenkins state with seed %u\n", state->seed);
return state;
}
/** \fn void jenkins_state_pack(jenkins_state_t *state, void *jenkins_packed);
* \brief Pack a Jenkins hash state into the preallocated contiguous memory area pointed to by jenkins_packed.
* \param state points to the Jenkins hash state
* \param jenkins_packed points to the contiguous memory area used to store the state. Its size must be at least jenkins_state_packed_size().
*/
void jenkins_state_pack(jenkins_state_t *state, void *jenkins_packed)
{
if (state && jenkins_packed)
{
memcpy(jenkins_packed, &(state->seed), sizeof(cmph_uint32));
}
}
/** \fn cmph_uint32 jenkins_state_packed_size(void);
* \brief Return the amount of space needed to pack a jenkins function.
* \return the size of the packed function or zero for failures
*/
cmph_uint32 jenkins_state_packed_size(void)
{
return sizeof(cmph_uint32);
}
/** \fn cmph_uint32 jenkins_hash_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen);
* \param jenkins_packed is a pointer to a contiguous memory area
* \param k is a pointer to a key
* \param keylen is the key length
* \return an integer that represents a hash value of 32 bits.
*/
cmph_uint32 jenkins_hash_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen)
{
cmph_uint32 hashes[3];
__jenkins_hash_vector(*((cmph_uint32 *)jenkins_packed), k, keylen, hashes);
return hashes[2];
}
/** \fn void jenkins_hash_vector_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes);
* \param jenkins_packed is a pointer to a contiguous memory area
* \param k is a pointer to a key
* \param keylen is the key length
* \param hashes is a pointer to memory large enough to fit three 32-bit integers.
*/
void jenkins_hash_vector_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes)
{
__jenkins_hash_vector(*((cmph_uint32 *)jenkins_packed), k, keylen, hashes);
}
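
The comment block above recommends masking the 32-bit result when fewer bits are needed. The sketch below is a minimal, self-contained take on the same lookup2-style mixing with a caller-supplied seed, followed by that masking step. The names sketch_hash, SKETCH_MIX, SKETCH_HASHSIZE and SKETCH_HASHMASK are hypothetical. Unlike the vendored code, it reads key bytes as unsigned char, so its results can differ from cmph for bytes of 0x80 and above on platforms where plain char is signed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HASHSIZE(n) ((uint32_t)1 << (n))
#define SKETCH_HASHMASK(n) (SKETCH_HASHSIZE(n) - 1)

/* Same nine-step mixing schedule as the mix() macro above. */
#define SKETCH_MIX(a, b, c) do { \
    a -= b; a -= c; a ^= (c >> 13); \
    b -= c; b -= a; b ^= (a << 8);  \
    c -= a; c -= b; c ^= (b >> 13); \
    a -= b; a -= c; a ^= (c >> 12); \
    b -= c; b -= a; b ^= (a << 16); \
    c -= a; c -= b; c ^= (b >> 5);  \
    a -= b; a -= c; a ^= (c >> 3);  \
    b -= c; b -= a; b ^= (a << 10); \
    c -= a; c -= b; c ^= (b >> 15); \
} while (0)

/* Assemble four bytes into a 32-bit word, least significant byte first. */
static uint32_t sketch_load32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t sketch_hash(const char *key, size_t keylen, uint32_t seed)
{
    const unsigned char *k = (const unsigned char *)key;
    uint32_t a = 0x9e3779b9u, b = 0x9e3779b9u, c = seed;
    size_t len = keylen;

    /* Consume the key twelve bytes at a time. */
    while (len >= 12) {
        a += sketch_load32(k);
        b += sketch_load32(k + 4);
        c += sketch_load32(k + 8);
        SKETCH_MIX(a, b, c);
        k += 12;
        len -= 12;
    }

    /* Fold in the length, then the remaining 0..11 bytes. */
    c += (uint32_t)keylen;
    switch (len) {
    case 11: c += (uint32_t)k[10] << 24; /* fall through */
    case 10: c += (uint32_t)k[9] << 16;  /* fall through */
    case 9:  c += (uint32_t)k[8] << 8;   /* fall through */
    case 8:  b += (uint32_t)k[7] << 24;  /* fall through */
    case 7:  b += (uint32_t)k[6] << 16;  /* fall through */
    case 6:  b += (uint32_t)k[5] << 8;   /* fall through */
    case 5:  b += k[4];                  /* fall through */
    case 4:  a += (uint32_t)k[3] << 24;  /* fall through */
    case 3:  a += (uint32_t)k[2] << 16;  /* fall through */
    case 2:  a += (uint32_t)k[1] << 8;   /* fall through */
    case 1:  a += k[0];                  /* fall through */
    default: break;
    }
    SKETCH_MIX(a, b, c);
    return c;
}

/* Reduce a hash to a slot in a table of SKETCH_HASHSIZE(bits) entries. */
static uint32_t sketch_slot(const char *key, uint32_t seed, unsigned bits)
{
    return sketch_hash(key, strlen(key), seed) & SKETCH_HASHMASK(bits);
}
```

For a table of SKETCH_HASHSIZE(10) entries, sketch_slot(key, seed, 10) always returns an index below 1024. The mask replaces the modulo only when the table size is a power of two.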

@ -0,0 +1,65 @@
#ifndef __JENKINS_HASH_H__
#define __JENKINS_HASH_H__
#include "hash.h"
typedef struct __jenkins_state_t
{
CMPH_HASH hashfunc;
cmph_uint32 seed;
} jenkins_state_t;
jenkins_state_t *jenkins_state_new(cmph_uint32 size); //size of hash table
/** \fn cmph_uint32 jenkins_hash(jenkins_state_t *state, const char *k, cmph_uint32 keylen);
* \param state is a pointer to a jenkins_state_t structure
* \param k is a pointer to a key
* \param keylen is the key length
* \return an integer that represents a hash value of 32 bits.
*/
cmph_uint32 jenkins_hash(jenkins_state_t *state, const char *k, cmph_uint32 keylen);
/** \fn void jenkins_hash_vector_(jenkins_state_t *state, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes);
* \param state is a pointer to a jenkins_state_t structure
* \param k is a pointer to a key
* \param keylen is the key length
* \param hashes is a pointer to memory large enough to fit three 32-bit integers.
*/
void jenkins_hash_vector_(jenkins_state_t *state, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes);
void jenkins_state_dump(jenkins_state_t *state, char **buf, cmph_uint32 *buflen);
jenkins_state_t *jenkins_state_copy(jenkins_state_t *src_state);
jenkins_state_t *jenkins_state_load(const char *buf, cmph_uint32 buflen);
void jenkins_state_destroy(jenkins_state_t *state);
/** \fn void jenkins_state_pack(jenkins_state_t *state, void *jenkins_packed);
* \brief Pack a Jenkins hash state into the preallocated contiguous memory area pointed to by jenkins_packed.
* \param state points to the Jenkins hash state
* \param jenkins_packed points to the contiguous memory area used to store the state. Its size must be at least jenkins_state_packed_size().
*/
void jenkins_state_pack(jenkins_state_t *state, void *jenkins_packed);
/** \fn cmph_uint32 jenkins_state_packed_size();
* \brief Return the amount of space needed to pack a jenkins function.
* \return the size of the packed function or zero for failures
*/
cmph_uint32 jenkins_state_packed_size(void);
/** \fn cmph_uint32 jenkins_hash_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen);
* \param jenkins_packed is a pointer to a contiguous memory area
* \param k is a pointer to a key
* \param keylen is the key length
* \return an integer that represents a hash value of 32 bits.
*/
cmph_uint32 jenkins_hash_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen);
/** \fn void jenkins_hash_vector_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes);
* \param jenkins_packed is a pointer to a contiguous memory area
* \param k is a pointer to a key
* \param keylen is the key length
* \param hashes is a pointer to memory large enough to fit three 32-bit integers.
*/
void jenkins_hash_vector_packed(void *jenkins_packed, const char *k, cmph_uint32 keylen, cmph_uint32 * hashes);
#endif

girepository/cmph/main.c Normal file

@ -0,0 +1,342 @@
#ifdef WIN32
#include "wingetopt.h"
#else
#include <getopt.h>
#endif
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <time.h>
#include <limits.h>
#include <assert.h>
#include "cmph.h"
#include "hash.h"
#ifdef WIN32
#define VERSION "0.8"
#else
#include "config.h"
#endif
void usage(const char *prg)
{
fprintf(stderr, "usage: %s [-v] [-h] [-V] [-k nkeys] [-f hash_function] [-g [-c algorithm_dependent_value][-s seed] ] [-a algorithm] [-M memory_in_MB] [-b algorithm_dependent_value] [-t keys_per_bin] [-d tmp_dir] [-m file.mph] keysfile\n", prg);
}
void usage_long(const char *prg)
{
cmph_uint32 i;
fprintf(stderr, "usage: %s [-v] [-h] [-V] [-k nkeys] [-f hash_function] [-g [-c algorithm_dependent_value][-s seed] ] [-a algorithm] [-M memory_in_MB] [-b algorithm_dependent_value] [-t keys_per_bin] [-d tmp_dir] [-m file.mph] keysfile\n", prg);
fprintf(stderr, "Minimal perfect hashing tool\n\n");
fprintf(stderr, " -h\t print this help message\n");
fprintf(stderr, " -c\t c value determines:\n");
fprintf(stderr, " \t * the number of vertices in the graph for the algorithms BMZ and CHM\n");
fprintf(stderr, " \t * the number of bits per key required in the FCH algorithm\n");
fprintf(stderr, " \t * the load factor in the CHD_PH algorithm\n");
fprintf(stderr, " -a\t algorithm - valid values are\n");
for (i = 0; i < CMPH_COUNT; ++i) fprintf(stderr, " \t * %s\n", cmph_names[i]);
fprintf(stderr, " -f\t hash function (may be used multiple times) - valid values are\n");
for (i = 0; i < CMPH_HASH_COUNT; ++i) fprintf(stderr, " \t * %s\n", cmph_hash_names[i]);
fprintf(stderr, " -V\t print version number and exit\n");
fprintf(stderr, " -v\t increase verbosity (may be used multiple times)\n");
fprintf(stderr, " -k\t number of keys\n");
fprintf(stderr, " -g\t generation mode\n");
fprintf(stderr, " -s\t random seed\n");
fprintf(stderr, " -m\t minimal perfect hash function file \n");
fprintf(stderr, " -M\t main memory availability (in MB) used in BRZ algorithm \n");
fprintf(stderr, " -d\t temporary directory used in BRZ algorithm \n");
fprintf(stderr, " -b\t the meaning of this parameter depends on the algorithm selected in the -a option:\n");
fprintf(stderr, " \t * For BRZ it is used to make the maximal number of keys in a bucket lower than 256.\n");
fprintf(stderr, " \t In this case its value should be an integer in the range [64,175]. Default is 128.\n\n");
fprintf(stderr, " \t * For BDZ it is used to determine the size of some precomputed rank\n");
fprintf(stderr, " \t information and its value should be an integer in the range [3,10]. Default\n");
fprintf(stderr, " \t is 7. Larger values yield more compact functions, but these\n");
fprintf(stderr, " \t are slower to evaluate.\n\n");
fprintf(stderr, " \t * For CHD and CHD_PH it is used to set the average number of keys per bucket\n");
fprintf(stderr, " \t and its value should be an integer in the range [1,32]. Default is 4. The\n");
fprintf(stderr, " \t larger this value, the slower the construction of the functions.\n");
fprintf(stderr, " \t This parameter has no effect for other algorithms.\n\n");
fprintf(stderr, " -t\t set the number of keys per bin for a t-perfect hashing function. A t-perfect\n");
fprintf(stderr, " \t hash function allows at most t collisions in a given bin. This parameter applies\n");
fprintf(stderr, " \t only to the CHD and CHD_PH algorithms. Its value should be an integer in the\n");
fprintf(stderr, " \t range [1,128]. Default is 1.\n");
fprintf(stderr, " keysfile\t line separated file with keys\n");
}
int main(int argc, char **argv)
{
cmph_uint32 verbosity = 0;
char generate = 0;
char *mphf_file = NULL;
FILE *mphf_fd = stdout;
const char *keys_file = NULL;
FILE *keys_fd;
cmph_uint32 nkeys = UINT_MAX;
cmph_uint32 seed = UINT_MAX;
CMPH_HASH *hashes = NULL;
cmph_uint32 nhashes = 0;
cmph_uint32 i;
CMPH_ALGO mph_algo = CMPH_CHM;
double c = 0;
cmph_config_t *config = NULL;
cmph_t *mphf = NULL;
char * tmp_dir = NULL;
cmph_io_adapter_t *source;
cmph_uint32 memory_availability = 0;
cmph_uint32 b = 0;
cmph_uint32 keys_per_bin = 1;
while (1)
{
/* getopt() returns int; a plain char may be unsigned and then never equals -1 */
int ch = getopt(argc, argv, "hVvgc:k:a:M:b:t:f:m:d:s:");
if (ch == -1) break;
switch (ch)
{
case 's':
{
char *cptr;
seed = (cmph_uint32)strtoul(optarg, &cptr, 10);
if(*cptr != 0) {
fprintf(stderr, "Invalid seed %s\n", optarg);
exit(1);
}
}
break;
case 'c':
{
char *endptr;
c = strtod(optarg, &endptr);
if(*endptr != 0) {
fprintf(stderr, "Invalid c value %s\n", optarg);
exit(1);
}
}
break;
case 'g':
generate = 1;
break;
case 'k':
{
char *endptr;
nkeys = (cmph_uint32)strtoul(optarg, &endptr, 10);
if(*endptr != 0) {
fprintf(stderr, "Invalid number of keys %s\n", optarg);
exit(1);
}
}
break;
case 'm':
mphf_file = strdup(optarg);
break;
case 'd':
tmp_dir = strdup(optarg);
break;
case 'M':
{
char *cptr;
memory_availability = (cmph_uint32)strtoul(optarg, &cptr, 10);
if(*cptr != 0) {
fprintf(stderr, "Invalid memory availability %s\n", optarg);
exit(1);
}
}
break;
case 'b':
{
char *cptr;
b = (cmph_uint32)strtoul(optarg, &cptr, 10);
if(*cptr != 0) {
fprintf(stderr, "Invalid value for parameter b: %s\n", optarg);
exit(1);
}
}
break;
case 't':
{
char *cptr;
keys_per_bin = (cmph_uint32)strtoul(optarg, &cptr, 10);
if(*cptr != 0) {
fprintf(stderr, "Invalid value for parameter t: %s\n", optarg);
exit(1);
}
}
break;
case 'v':
++verbosity;
break;
case 'V':
printf("%s\n", VERSION);
return 0;
case 'h':
usage_long(argv[0]);
return 0;
case 'a':
{
char valid = 0;
for (i = 0; i < CMPH_COUNT; ++i)
{
if (strcmp(cmph_names[i], optarg) == 0)
{
mph_algo = i;
valid = 1;
break;
}
}
if (!valid)
{
fprintf(stderr, "Invalid mph algorithm: %s. It is not available in version %s\n", optarg, VERSION);
return -1;
}
}
break;
case 'f':
{
char valid = 0;
for (i = 0; i < CMPH_HASH_COUNT; ++i)
{
if (strcmp(cmph_hash_names[i], optarg) == 0)
{
hashes = (CMPH_HASH *)realloc(hashes, sizeof(CMPH_HASH) * ( nhashes + 2 ));
hashes[nhashes] = i;
hashes[nhashes + 1] = CMPH_HASH_COUNT;
++nhashes;
valid = 1;
break;
}
}
if (!valid)
{
fprintf(stderr, "Invalid hash function: %s\n", optarg);
return -1;
}
}
break;
default:
usage(argv[0]);
return 1;
}
}
if (optind != argc - 1)
{
usage(argv[0]);
return 1;
}
keys_file = argv[optind];
if (seed == UINT_MAX) seed = (cmph_uint32)time(NULL);
srand(seed);
int ret = 0;
if (mphf_file == NULL)
{
mphf_file = (char *)malloc(strlen(keys_file) + 5);
memcpy(mphf_file, keys_file, strlen(keys_file));
memcpy(mphf_file + strlen(keys_file), ".mph\0", (size_t)5);
}
keys_fd = fopen(keys_file, "r");
if (keys_fd == NULL)
{
fprintf(stderr, "Unable to open file %s: %s\n", keys_file, strerror(errno));
return -1;
}
if(nkeys == UINT_MAX) source = cmph_io_nlfile_adapter(keys_fd);
else source = cmph_io_nlnkfile_adapter(keys_fd, nkeys);
if (generate)
{
//Create mphf
mphf_fd = fopen(mphf_file, "w");
config = cmph_config_new(source);
cmph_config_set_algo(config, mph_algo);
if (nhashes) cmph_config_set_hashfuncs(config, hashes);
cmph_config_set_verbosity(config, verbosity);
cmph_config_set_tmp_dir(config, (cmph_uint8 *) tmp_dir);
cmph_config_set_mphf_fd(config, mphf_fd);
cmph_config_set_memory_availability(config, memory_availability);
cmph_config_set_b(config, b);
cmph_config_set_keys_per_bin(config, keys_per_bin);
//if((mph_algo == CMPH_BMZ || mph_algo == CMPH_BRZ) && c >= 2.0) c=1.15;
if(mph_algo == CMPH_BMZ && c >= 2.0) c=1.15;
if (c != 0) cmph_config_set_graphsize(config, c);
mphf = cmph_new(config);
cmph_config_destroy(config);
if (mphf == NULL)
{
fprintf(stderr, "Unable to create minimal perfect hash function\n");
//cmph_config_destroy(config);
free(mphf_file);
return -1;
}
if (mphf_fd == NULL)
{
fprintf(stderr, "Unable to open output file %s: %s\n", mphf_file, strerror(errno));
free(mphf_file);
return -1;
}
cmph_dump(mphf, mphf_fd);
cmph_destroy(mphf);
fclose(mphf_fd);
}
else
{
cmph_uint8 * hashtable = NULL;
mphf_fd = fopen(mphf_file, "r");
if (mphf_fd == NULL)
{
fprintf(stderr, "Unable to open input file %s: %s\n", mphf_file, strerror(errno));
free(mphf_file);
return -1;
}
mphf = cmph_load(mphf_fd);
fclose(mphf_fd);
if (!mphf)
{
fprintf(stderr, "Unable to parse input file %s\n", mphf_file);
free(mphf_file);
return -1;
}
cmph_uint32 siz = cmph_size(mphf);
hashtable = (cmph_uint8*)calloc(siz, sizeof(cmph_uint8));
memset(hashtable, 0,(size_t) siz);
//check all keys
for (i = 0; i < source->nkeys; ++i)
{
cmph_uint32 h;
char *buf;
cmph_uint32 buflen = 0;
source->read(source->data, &buf, &buflen);
h = cmph_search(mphf, buf, buflen);
if (!(h < siz))
{
fprintf(stderr, "Unknown key %.*s in the input.\n", (int)buflen, buf);
ret = 1;
} else if(hashtable[h] >= keys_per_bin)
{
fprintf(stderr, "More than %u keys were mapped to bin %u\n", keys_per_bin, h);
fprintf(stderr, "Duplicated or unknown key %.*s in the input\n", (int)buflen, buf);
ret = 1;
} else hashtable[h]++;
if (verbosity)
{
printf("%s -> %u\n", buf, h);
}
source->dispose(source->data, buf, buflen);
}
cmph_destroy(mphf);
free(hashtable);
}
fclose(keys_fd);
free(mphf_file);
free(tmp_dir);
cmph_io_nlfile_adapter_destroy(source);
return ret;
}

@ -0,0 +1,77 @@
cmph_sources = [
'bdz.c',
'bdz_ph.c',
'bmz8.c',
'bmz.c',
'brz.c',
'buffer_entry.c',
'buffer_manager.c',
'chd.c',
'chd_ph.c',
'chm.c',
'cmph.c',
'cmph_structs.c',
'compressed_rank.c',
'compressed_seq.c',
'fch_buckets.c',
'fch.c',
'graph.c',
'hash.c',
'jenkins_hash.c',
'miller_rabin.c',
'select.c',
'vqueue.c',
'vstack.c',
]
cmph_deps = [
libglib_dep,
libgobject_dep,
libm,
]
custom_c_args = []
if cc.get_id() != 'msvc'
custom_c_args = cc.get_supported_arguments([
'-Wno-implicit-fallthrough',
'-Wno-old-style-definition',
'-Wno-suggest-attribute=noreturn',
'-Wno-type-limits',
'-Wno-undef',
'-Wno-unused-parameter',
'-Wno-cast-align',
'-Wno-unused-function',
'-Wno-return-type',
'-Wno-sometimes-uninitialized',
])
endif
cmph = static_library('cmph',
sources: cmph_sources,
c_args: custom_c_args,
dependencies: cmph_deps,
)
cmph_dep = declare_dependency(
link_with: cmph,
include_directories: include_directories('.'),
)
if cc.get_id() != 'msvc'
custom_c_args = cc.get_supported_arguments([
'-Wno-old-style-definition',
'-Wno-type-limits',
])
endif
cmph_test = executable('cmph-bdz-test', '../cmph-bdz-test.c',
dependencies: [
cmph_dep,
libglib_dep,
libgobject_dep,
],
c_args: custom_c_args,
)
test('cmph-bdz-test', cmph_test)

@ -0,0 +1,67 @@
#include "miller_rabin.h"
static inline cmph_uint64 int_pow(cmph_uint64 a, cmph_uint64 d, cmph_uint64 n)
{
cmph_uint64 a_pow = a;
cmph_uint64 res = 1;
while(d > 0)
{
if((d & 1) == 1)
res =(((cmph_uint64)res) * a_pow) % n;
a_pow = (((cmph_uint64)a_pow) * a_pow) % n;
d /= 2;
};
return res;
};
static inline cmph_uint8 check_witness(cmph_uint64 a_exp_d, cmph_uint64 n, cmph_uint64 s)
{
cmph_uint64 i;
cmph_uint64 a_exp = a_exp_d;
if(a_exp == 1 || a_exp == (n - 1))
return 1;
for(i = 1; i < s; i++)
{
a_exp = (((cmph_uint64)a_exp) * a_exp) % n;
if(a_exp == (n - 1))
return 1;
};
return 0;
};
cmph_uint8 check_primality(cmph_uint64 n)
{
cmph_uint64 a, d, s, a_exp_d;
if((n % 2) == 0)
return 0;
if((n % 3) == 0)
return 0;
if((n % 5) == 0)
return 0;
if((n % 7 ) == 0)
return 0;
//we decompose the number n - 1 into 2^s*d, with d odd
s = 0;
d = n - 1;
do
{
s++;
d /= 2;
}while((d % 2) == 0);
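// Witnesses 2, 7 and 61 make the test deterministic for every n < 4,759,123,141;
// the 64-bit products in int_pow() additionally require n < 2^32.
a = 2;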
a_exp_d = int_pow(a, d, n);
if(check_witness(a_exp_d, n, s) == 0)
return 0;
a = 7;
a_exp_d = int_pow(a, d, n);
if(check_witness(a_exp_d, n, s) == 0)
return 0;
a = 61;
a_exp_d = int_pow(a, d, n);
if(check_witness(a_exp_d, n, s) == 0)
return 0;
return 1;
};
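
As a cross-check of the routine above, here is a minimal, self-contained sketch of the same Miller-Rabin test with witnesses 2, 7 and 61. It is not part of cmph: the names `pow_mod` and `is_prime_u32` are hypothetical, and the input is restricted to 32-bit values so that the 64-bit products cannot overflow. Unlike `check_primality`, which returns 0 for 2, 3, 5 and 7 themselves, this sketch reports small primes as prime and rejects 0 and 1 up front.

```c
#include <assert.h>
#include <stdint.h>

/* (base^exp) mod n. Safe from overflow as long as n < 2^32. */
static uint64_t pow_mod(uint64_t base, uint64_t exp, uint64_t n)
{
    uint64_t result = 1;
    assert(n > 0);
    base %= n;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % n;
        base = (base * base) % n;
        exp >>= 1;
    }
    return result;
}

/* Hypothetical helper (not a cmph function): 1 if n is prime, else 0.
 * Witnesses 2, 7 and 61 are sufficient for every n < 4,759,123,141,
 * which covers the whole 32-bit range. */
static int is_prime_u32(uint32_t n)
{
    static const uint32_t small_primes[] = { 2, 3, 5, 7 };
    static const uint64_t witnesses[] = { 2, 7, 61 };
    uint64_t d, x;
    unsigned s = 0, i, r;

    if (n < 2)
        return 0;
    for (i = 0; i < 4; i++) {
        if (n == small_primes[i])
            return 1;
        if (n % small_primes[i] == 0)
            return 0;
    }
    /* Write n - 1 as 2^s * d with d odd. */
    d = (uint64_t)n - 1;
    while ((d & 1) == 0) {
        d >>= 1;
        s++;
    }
    for (i = 0; i < 3; i++) {
        if (witnesses[i] % n == 0)
            continue; /* only happens for n == 61 */
        x = pow_mod(witnesses[i], d, n);
        if (x == 1 || x == (uint64_t)n - 1)
            continue;
        /* Square up to s - 1 times looking for n - 1. */
        for (r = 1; r < s; r++) {
            x = (x * x) % n;
            if (x == (uint64_t)n - 1)
                break;
        }
        if (r == s)
            return 0; /* this witness proves n composite */
    }
    return 1;
}
```

For instance, `is_prime_u32(561u)` (a Carmichael number) should yield 0, while `is_prime_u32(2147483647u)` (the Mersenne prime 2^31 - 1) should yield 1.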

@ -0,0 +1,5 @@
#ifndef _CMPH_MILLER_RABIN_H__
#define _CMPH_MILLER_RABIN_H__
#include "cmph_types.h"
cmph_uint8 check_primality(cmph_uint64 n);
#endif

@ -0,0 +1,49 @@
#include "sdbm_hash.h"
#include <stdlib.h>
sdbm_state_t *sdbm_state_new()
{
sdbm_state_t *state = (sdbm_state_t *)malloc(sizeof(sdbm_state_t));
state->hashfunc = CMPH_HASH_SDBM;
return state;
}
void sdbm_state_destroy(sdbm_state_t *state)
{
free(state);
}
cmph_uint32 sdbm_hash(sdbm_state_t *state, const char *k, cmph_uint32 keylen)
{
register cmph_uint32 hash = 0;
const unsigned char *ptr = (unsigned char *)k;
cmph_uint32 i = 0;
while(i < keylen) {
hash = *ptr + (hash << 6) + (hash << 16) - hash;
++ptr, ++i;
}
return hash;
}
void sdbm_state_dump(sdbm_state_t *state, char **buf, cmph_uint32 *buflen)
{
*buf = NULL;
*buflen = 0;
return;
}
sdbm_state_t *sdbm_state_copy(sdbm_state_t *src_state)
{
sdbm_state_t *dest_state = (sdbm_state_t *)malloc(sizeof(sdbm_state_t));
dest_state->hashfunc = src_state->hashfunc;
return dest_state;
}
sdbm_state_t *sdbm_state_load(const char *buf, cmph_uint32 buflen)
{
sdbm_state_t *state = (sdbm_state_t *)malloc(sizeof(sdbm_state_t));
state->hashfunc = CMPH_HASH_SDBM;
return state;
}
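
The update step in `sdbm_hash` is a shift-and-subtract way of multiplying by 65599, since 64 + 65536 - 1 = 65599. The sketch below, with the hypothetical names `sdbm_shift` and `sdbm_mul` and standard `uint32_t` standing in for `cmph_uint32`, shows both forms side by side so the equivalence can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift-and-subtract form, mirroring the loop in sdbm_hash(). */
static uint32_t sdbm_shift(const char *key, size_t len)
{
    const unsigned char *p = (const unsigned char *)key;
    uint32_t hash = 0;
    size_t i;

    assert(key != NULL);
    for (i = 0; i < len; i++)
        hash = p[i] + (hash << 6) + (hash << 16) - hash;
    return hash;
}

/* Multiplicative form: (h << 6) + (h << 16) - h == h * 65599 (mod 2^32). */
static uint32_t sdbm_mul(const char *key, size_t len)
{
    const unsigned char *p = (const unsigned char *)key;
    uint32_t hash = 0;
    size_t i;

    assert(key != NULL);
    for (i = 0; i < len; i++)
        hash = hash * 65599u + p[i];
    return hash;
}
```

Both functions wrap around modulo 2^32, so they agree on every input, not only on short keys.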

@ -0,0 +1,18 @@
#ifndef __SDBM_HASH_H__
#define __SDBM_HASH_H__
#include "hash.h"
typedef struct __sdbm_state_t
{
CMPH_HASH hashfunc;
} sdbm_state_t;
sdbm_state_t *sdbm_state_new();
cmph_uint32 sdbm_hash(sdbm_state_t *state, const char *k, cmph_uint32 keylen);
void sdbm_state_dump(sdbm_state_t *state, char **buf, cmph_uint32 *buflen);
sdbm_state_t *sdbm_state_copy(sdbm_state_t *src_state);
sdbm_state_t *sdbm_state_load(const char *buf, cmph_uint32 buflen);
void sdbm_state_destroy(sdbm_state_t *state);
#endif

337
girepository/cmph/select.c Normal file

@ -0,0 +1,337 @@
#include<stdlib.h>
#include<stdio.h>
#include <assert.h>
#include <string.h>
#include <limits.h>
#include "select_lookup_tables.h"
#include "select.h"
//#define DEBUG
#include "debug.h"
#ifndef STEP_SELECT_TABLE
#define STEP_SELECT_TABLE 128
#endif
#ifndef NBITS_STEP_SELECT_TABLE
#define NBITS_STEP_SELECT_TABLE 7
#endif
#ifndef MASK_STEP_SELECT_TABLE
#define MASK_STEP_SELECT_TABLE 0x7f // 0x7f = 127
#endif
static inline void select_insert_0(cmph_uint32 * buffer)
{
(*buffer) >>= 1;
};
static inline void select_insert_1(cmph_uint32 * buffer)
{
(*buffer) >>= 1;
(*buffer) |= 0x80000000;
};
void select_init(select_t * sel)
{
sel->n = 0;
sel->m = 0;
sel->bits_vec = 0;
sel->select_table = 0;
};
cmph_uint32 select_get_space_usage(select_t * sel)
{
register cmph_uint32 nbits;
register cmph_uint32 vec_size;
register cmph_uint32 sel_table_size;
register cmph_uint32 space_usage;
nbits = sel->n + sel->m;
vec_size = (nbits + 31) >> 5;
sel_table_size = (sel->n >> NBITS_STEP_SELECT_TABLE) + 1; // (sel->n >> NBITS_STEP_SELECT_TABLE) = (sel->n/STEP_SELECT_TABLE)
space_usage = 2 * sizeof(cmph_uint32) * 8; // n and m
space_usage += vec_size * (cmph_uint32) sizeof(cmph_uint32) * 8;
space_usage += sel_table_size * (cmph_uint32)sizeof(cmph_uint32) * 8;
return space_usage;
}
void select_destroy(select_t * sel)
{
free(sel->bits_vec);
free(sel->select_table);
sel->bits_vec = 0;
sel->select_table = 0;
};
static inline void select_generate_sel_table(select_t * sel)
{
register cmph_uint8 * bits_table = (cmph_uint8 *)sel->bits_vec;
register cmph_uint32 part_sum, old_part_sum;
register cmph_uint32 vec_idx, one_idx, sel_table_idx;
part_sum = vec_idx = one_idx = sel_table_idx = 0;
for(;;)
{
// FABIANO: Shouldn't it be one_idx >= sel->n?
if(one_idx >= sel->n)
break;
do
{
old_part_sum = part_sum;
part_sum += rank_lookup_table[bits_table[vec_idx]];
vec_idx++;
} while (part_sum <= one_idx);
sel->select_table[sel_table_idx] = select_lookup_table[bits_table[vec_idx - 1]][one_idx - old_part_sum] + ((vec_idx - 1) << 3); // ((vec_idx - 1) << 3) = ((vec_idx - 1) * 8)
one_idx += STEP_SELECT_TABLE ;
sel_table_idx++;
};
};
void select_generate(select_t * sel, cmph_uint32 * keys_vec, cmph_uint32 n, cmph_uint32 m)
{
register cmph_uint32 i, j, idx;
cmph_uint32 buffer = 0;
register cmph_uint32 nbits;
register cmph_uint32 vec_size;
register cmph_uint32 sel_table_size;
sel->n = n;
sel->m = m; // n values in the range [0,m-1]
nbits = sel->n + sel->m;
vec_size = (nbits + 31) >> 5; // (nbits + 31) >> 5 = (nbits + 31)/32
sel_table_size = (sel->n >> NBITS_STEP_SELECT_TABLE) + 1; // (sel->n >> NBITS_STEP_SELECT_TABLE) = (sel->n/STEP_SELECT_TABLE)
if(sel->bits_vec)
{
free(sel->bits_vec);
}
sel->bits_vec = (cmph_uint32 *)calloc(vec_size, sizeof(cmph_uint32));
if(sel->select_table)
{
free(sel->select_table);
}
sel->select_table = (cmph_uint32 *)calloc(sel_table_size, sizeof(cmph_uint32));
idx = i = j = 0;
for(;;)
{
while(keys_vec[j]==i)
{
select_insert_1(&buffer);
idx++;
if((idx & 0x1f) == 0 ) // (idx & 0x1f) = idx % 32
sel->bits_vec[(idx >> 5) - 1] = buffer; // (idx >> 5) = idx/32
j++;
if(j == sel->n)
goto loop_end;
//assert(keys_vec[j] < keys_vec[j-1]);
}
if(i == sel->m)
break;
while(keys_vec[j] > i)
{
select_insert_0(&buffer);
idx++;
if((idx & 0x1f) == 0 ) // (idx & 0x1f) = idx % 32
sel->bits_vec[(idx >> 5) - 1] = buffer; // (idx >> 5) = idx/32
i++;
};
};
loop_end:
if((idx & 0x1f) != 0 ) // (idx & 0x1f) = idx % 32
{
buffer >>= 32 - (idx & 0x1f);
sel->bits_vec[ (idx - 1) >> 5 ] = buffer;
};
select_generate_sel_table(sel);
};
static inline cmph_uint32 _select_query(cmph_uint8 * bits_table, cmph_uint32 * select_table, cmph_uint32 one_idx)
{
register cmph_uint32 vec_bit_idx ,vec_byte_idx;
register cmph_uint32 part_sum, old_part_sum;
vec_bit_idx = select_table[one_idx >> NBITS_STEP_SELECT_TABLE]; // one_idx >> NBITS_STEP_SELECT_TABLE = one_idx/STEP_SELECT_TABLE
vec_byte_idx = vec_bit_idx >> 3; // vec_bit_idx / 8
one_idx &= MASK_STEP_SELECT_TABLE; // one_idx %= STEP_SELECT_TABLE == one_idx &= MASK_STEP_SELECT_TABLE
one_idx += rank_lookup_table[bits_table[vec_byte_idx] & ((1 << (vec_bit_idx & 0x7)) - 1)];
part_sum = 0;
do
{
old_part_sum = part_sum;
part_sum += rank_lookup_table[bits_table[vec_byte_idx]];
vec_byte_idx++;
}while (part_sum <= one_idx);
return select_lookup_table[bits_table[vec_byte_idx - 1]][one_idx - old_part_sum] + ((vec_byte_idx-1) << 3);
}
cmph_uint32 select_query(select_t * sel, cmph_uint32 one_idx)
{
return _select_query((cmph_uint8 *)sel->bits_vec, sel->select_table, one_idx);
};
static inline cmph_uint32 _select_next_query(cmph_uint8 * bits_table, cmph_uint32 vec_bit_idx)
{
register cmph_uint32 vec_byte_idx, one_idx;
register cmph_uint32 part_sum, old_part_sum;
vec_byte_idx = vec_bit_idx >> 3;
one_idx = rank_lookup_table[bits_table[vec_byte_idx] & ((1U << (vec_bit_idx & 0x7)) - 1U)] + 1U;
part_sum = 0;
do
{
old_part_sum = part_sum;
part_sum += rank_lookup_table[bits_table[vec_byte_idx]];
vec_byte_idx++;
}while (part_sum <= one_idx);
return select_lookup_table[bits_table[(vec_byte_idx - 1)]][(one_idx - old_part_sum)] + ((vec_byte_idx - 1) << 3);
}
cmph_uint32 select_next_query(select_t * sel, cmph_uint32 vec_bit_idx)
{
return _select_next_query((cmph_uint8 *)sel->bits_vec, vec_bit_idx);
};
void select_dump(select_t *sel, char **buf, cmph_uint32 *buflen)
{
register cmph_uint32 nbits = sel->n + sel->m;
register cmph_uint32 vec_size = ((nbits + 31) >> 5) * (cmph_uint32)sizeof(cmph_uint32); // (nbits + 31) >> 5 = (nbits + 31)/32
register cmph_uint32 sel_table_size = ((sel->n >> NBITS_STEP_SELECT_TABLE) + 1) * (cmph_uint32)sizeof(cmph_uint32); // (sel->n >> NBITS_STEP_SELECT_TABLE) = (sel->n/STEP_SELECT_TABLE)
register cmph_uint32 pos = 0;
*buflen = 2*(cmph_uint32)sizeof(cmph_uint32) + vec_size + sel_table_size;
*buf = (char *)calloc(*buflen, sizeof(char));
if (!*buf)
{
*buflen = UINT_MAX;
return;
}
memcpy(*buf, &(sel->n), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
memcpy(*buf + pos, &(sel->m), sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
memcpy(*buf + pos, sel->bits_vec, vec_size);
pos += vec_size;
memcpy(*buf + pos, sel->select_table, sel_table_size);
DEBUGP("Dumped select structure with size %u bytes\n", *buflen);
}
void select_load(select_t * sel, const char *buf, cmph_uint32 buflen)
{
register cmph_uint32 pos = 0;
register cmph_uint32 nbits = 0;
register cmph_uint32 vec_size = 0;
register cmph_uint32 sel_table_size = 0;
memcpy(&(sel->n), buf, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
memcpy(&(sel->m), buf + pos, sizeof(cmph_uint32));
pos += (cmph_uint32)sizeof(cmph_uint32);
nbits = sel->n + sel->m;
vec_size = ((nbits + 31) >> 5) * (cmph_uint32)sizeof(cmph_uint32); // (nbits + 31) >> 5 = (nbits + 31)/32
sel_table_size = ((sel->n >> NBITS_STEP_SELECT_TABLE) + 1) * (cmph_uint32)sizeof(cmph_uint32); // (sel->n >> NBITS_STEP_SELECT_TABLE) = (sel->n/STEP_SELECT_TABLE)
if(sel->bits_vec)
{
free(sel->bits_vec);
}
sel->bits_vec = (cmph_uint32 *)calloc(vec_size/sizeof(cmph_uint32), sizeof(cmph_uint32));
if(sel->select_table)
{
free(sel->select_table);
}
sel->select_table = (cmph_uint32 *)calloc(sel_table_size/sizeof(cmph_uint32), sizeof(cmph_uint32));
memcpy(sel->bits_vec, buf + pos, vec_size);
pos += vec_size;
memcpy(sel->select_table, buf + pos, sel_table_size);
DEBUGP("Loaded select structure with size %u bytes\n", buflen);
}
/** \fn void select_pack(select_t *sel, void *sel_packed);
* \brief Pack a select structure into a preallocated contiguous memory area pointed to by sel_packed.
* \param sel points to the select structure
* \param sel_packed pointer to the contiguous memory area used to store the select structure. Its size must be at least the value returned by select_packed_size (@see select_packed_size)
*/
void select_pack(select_t *sel, void *sel_packed)
{
if (sel && sel_packed)
{
char *buf = NULL;
cmph_uint32 buflen = 0;
select_dump(sel, &buf, &buflen);
memcpy(sel_packed, buf, buflen);
free(buf);
}
}
/** \fn cmph_uint32 select_packed_size(select_t *sel);
* \brief Return the amount of space needed to pack a select structure.
* \return the size of the packed select structure or zero for failures
*/
cmph_uint32 select_packed_size(select_t *sel)
{
register cmph_uint32 nbits = sel->n + sel->m;
register cmph_uint32 vec_size = ((nbits + 31) >> 5) * (cmph_uint32)sizeof(cmph_uint32); // (nbits + 31) >> 5 = (nbits + 31)/32
register cmph_uint32 sel_table_size = ((sel->n >> NBITS_STEP_SELECT_TABLE) + 1) * (cmph_uint32)sizeof(cmph_uint32); // (sel->n >> NBITS_STEP_SELECT_TABLE) = (sel->n/STEP_SELECT_TABLE)
return 2*(cmph_uint32)sizeof(cmph_uint32) + vec_size + sel_table_size;
}
cmph_uint32 select_query_packed(void * sel_packed, cmph_uint32 one_idx)
{
register cmph_uint32 *ptr = (cmph_uint32 *)sel_packed;
register cmph_uint32 n = *ptr++;
register cmph_uint32 m = *ptr++;
register cmph_uint32 nbits = n + m;
register cmph_uint32 vec_size = (nbits + 31) >> 5; // (nbits + 31) >> 5 = (nbits + 31)/32
register cmph_uint8 * bits_vec = (cmph_uint8 *)ptr;
register cmph_uint32 * select_table = ptr + vec_size;
return _select_query(bits_vec, select_table, one_idx);
}
cmph_uint32 select_next_query_packed(void * sel_packed, cmph_uint32 vec_bit_idx)
{
register cmph_uint8 * bits_vec = (cmph_uint8 *)sel_packed;
bits_vec += 8; // skipping n and m
return _select_next_query(bits_vec, vec_bit_idx);
}
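
The loop in `select_generate` writes one 1-bit per key and one 0-bit each time the running value `i` advances, so the k-th key (0-based) can be recovered as `select(k) - k`. The sketch below illustrates that encoding under simplifying assumptions: it stores one bit per byte and scans linearly instead of using the packed words and lookup tables, and the names `unary_encode`, `naive_select` and `decode_key` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode n non-decreasing keys: a 0 for every step of the running value,
 * a 1 for every key. Returns the number of bits written. */
static size_t unary_encode(const uint32_t *keys, size_t n,
                           uint8_t *bits, size_t cap)
{
    size_t pos = 0, j;
    uint32_t cur = 0;

    for (j = 0; j < n; j++) {
        assert(keys[j] >= cur); /* keys must be sorted */
        while (cur < keys[j]) {
            assert(pos < cap);
            bits[pos++] = 0;
            cur++;
        }
        assert(pos < cap);
        bits[pos++] = 1;
    }
    return pos;
}

/* Position of the one_idx-th (0-based) 1-bit, or nbits if there is none. */
static size_t naive_select(const uint8_t *bits, size_t nbits, size_t one_idx)
{
    size_t pos;

    for (pos = 0; pos < nbits; pos++) {
        if (bits[pos]) {
            if (one_idx == 0)
                return pos;
            one_idx--;
        }
    }
    return nbits;
}

/* The k-th key is preceded by k ones and keys[k] zeros. */
static uint32_t decode_key(const uint8_t *bits, size_t nbits, size_t k)
{
    return (uint32_t)(naive_select(bits, nbits, k) - k);
}
```

With keys `{1, 1, 3}` the encoder produces the bits `0 1 1 0 0 1`, and `decode_key` recovers 1, 1 and 3 for k = 0, 1 and 2.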

@ -0,0 +1,61 @@
#ifndef __CMPH_SELECT_H__
#define __CMPH_SELECT_H__
#include "cmph_types.h"
struct _select_t
{
cmph_uint32 n,m;
cmph_uint32 * bits_vec;
cmph_uint32 * select_table;
};
typedef struct _select_t select_t;
void select_init(select_t * sel);
void select_destroy(select_t * sel);
void select_generate(select_t * sel, cmph_uint32 * keys_vec, cmph_uint32 n, cmph_uint32 m);
cmph_uint32 select_query(select_t * sel, cmph_uint32 one_idx);
cmph_uint32 select_next_query(select_t * sel, cmph_uint32 vec_bit_idx);
cmph_uint32 select_get_space_usage(select_t * sel);
void select_dump(select_t *sel, char **buf, cmph_uint32 *buflen);
void select_load(select_t * sel, const char *buf, cmph_uint32 buflen);
/** \fn void select_pack(select_t *sel, void *sel_packed);
* \brief Pack a select structure into a preallocated contiguous memory area pointed to by sel_packed.
* \param sel points to the select structure
* \param sel_packed pointer to the contiguous memory area used to store the select structure. Its size must be at least the value returned by select_packed_size (@see select_packed_size)
*/
void select_pack(select_t *sel, void *sel_packed);
/** \fn cmph_uint32 select_packed_size(select_t *sel);
* \brief Return the amount of space needed to pack a select structure.
* \return the size of the packed select structure or zero for failures
*/
cmph_uint32 select_packed_size(select_t *sel);
/** \fn cmph_uint32 select_query_packed(void * sel_packed, cmph_uint32 one_idx);
* \param sel_packed is a pointer to a contiguous memory area
* \param one_idx is the rank for which we want to compute select, the inverse of rank
* \return an integer that represents the select value of rank one_idx.
*/
cmph_uint32 select_query_packed(void * sel_packed, cmph_uint32 one_idx);
/** \fn cmph_uint32 select_next_query_packed(void * sel_packed, cmph_uint32 vec_bit_idx);
* \param sel_packed is a pointer to a contiguous memory area
* \param vec_bit_idx is a value previously computed by @see select_query_packed
* \return an integer that represents the next select value greater than vec_bit_idx.
*/
cmph_uint32 select_next_query_packed(void * sel_packed, cmph_uint32 vec_bit_idx);
#endif

@ -0,0 +1,170 @@
#ifndef SELECT_LOOKUP_TABLES
#define SELECT_LOOKUP_TABLES
#include "cmph_types.h"
/*
rank_lookup_table[i] simply gives the number of bits set to one in the byte of value i.
For example if i = 01010101 in binary then we have :
rank_lookup_table[i] = 4
*/
static cmph_uint8 rank_lookup_table[256] ={
0 , 1 , 1 , 2 , 1 , 2 , 2 , 3 , 1 , 2 , 2 , 3 , 2 , 3 , 3 , 4
, 1 , 2 , 2 , 3 , 2 , 3 , 3 , 4 , 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5
, 1 , 2 , 2 , 3 , 2 , 3 , 3 , 4 , 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5
, 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5 , 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6
, 1 , 2 , 2 , 3 , 2 , 3 , 3 , 4 , 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5
, 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5 , 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6
, 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5 , 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6
, 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6 , 4 , 5 , 5 , 6 , 5 , 6 , 6 , 7
, 1 , 2 , 2 , 3 , 2 , 3 , 3 , 4 , 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5
, 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5 , 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6
, 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5 , 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6
, 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6 , 4 , 5 , 5 , 6 , 5 , 6 , 6 , 7
, 2 , 3 , 3 , 4 , 3 , 4 , 4 , 5 , 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6
, 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6 , 4 , 5 , 5 , 6 , 5 , 6 , 6 , 7
, 3 , 4 , 4 , 5 , 4 , 5 , 5 , 6 , 4 , 5 , 5 , 6 , 5 , 6 , 6 , 7
, 4 , 5 , 5 , 6 , 5 , 6 , 6 , 7 , 5 , 6 , 6 , 7 , 6 , 7 , 7 , 8
};
/*
select_lookup_table[i][j] simply gives the index of the j'th bit set to one in the byte of value i.
For example if i=01010101 in binary then we have :
select_lookup_table[i][0] = 0, the first bit set to one is at position 0
select_lookup_table[i][1] = 2, the second bit set to one is at position 2
select_lookup_table[i][2] = 4, the third bit set to one is at position 4
select_lookup_table[i][3] = 6, the fourth bit set to one is at position 6
select_lookup_table[i][4] = 255, there are no more than 4 bits set to one in i, so we return the escape value 255.
*/
static cmph_uint8 select_lookup_table[256][8]={
{ 255 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 2 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 255 , 255 , 255 , 255 , 255 } ,
{ 3 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 255 , 255 , 255 , 255 , 255 } ,
{ 2 , 3 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 255 , 255 , 255 , 255 } ,
{ 4 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 4 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 4 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 4 , 255 , 255 , 255 , 255 , 255 } ,
{ 2 , 4 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 4 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 4 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 4 , 255 , 255 , 255 , 255 } ,
{ 3 , 4 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 4 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 4 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 4 , 255 , 255 , 255 , 255 } ,
{ 2 , 3 , 4 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 4 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 4 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 4 , 255 , 255 , 255 } ,
{ 5 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 5 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 5 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 5 , 255 , 255 , 255 , 255 , 255 } ,
{ 2 , 5 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 5 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 5 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 5 , 255 , 255 , 255 , 255 } ,
{ 3 , 5 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 5 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 5 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 5 , 255 , 255 , 255 , 255 } ,
{ 2 , 3 , 5 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 5 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 5 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 5 , 255 , 255 , 255 } ,
{ 4 , 5 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 4 , 5 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 4 , 5 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 4 , 5 , 255 , 255 , 255 , 255 } ,
{ 2 , 4 , 5 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 4 , 5 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 4 , 5 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 4 , 5 , 255 , 255 , 255 } ,
{ 3 , 4 , 5 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 4 , 5 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 4 , 5 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 4 , 5 , 255 , 255 , 255 } ,
{ 2 , 3 , 4 , 5 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 4 , 5 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 4 , 5 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 4 , 5 , 255 , 255 } ,
{ 6 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 6 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 6 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 6 , 255 , 255 , 255 , 255 , 255 } ,
{ 2 , 6 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 6 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 6 , 255 , 255 , 255 , 255 } ,
{ 3 , 6 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 6 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 6 , 255 , 255 , 255 , 255 } ,
{ 2 , 3 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 6 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 6 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 6 , 255 , 255 , 255 } ,
{ 4 , 6 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 4 , 6 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 4 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 4 , 6 , 255 , 255 , 255 , 255 } ,
{ 2 , 4 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 4 , 6 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 4 , 6 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 4 , 6 , 255 , 255 , 255 } ,
{ 3 , 4 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 4 , 6 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 4 , 6 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 4 , 6 , 255 , 255 , 255 } ,
{ 2 , 3 , 4 , 6 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 4 , 6 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 4 , 6 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 4 , 6 , 255 , 255 } ,
{ 5 , 6 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 5 , 6 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 5 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 5 , 6 , 255 , 255 , 255 , 255 } ,
{ 2 , 5 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 5 , 6 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 5 , 6 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 5 , 6 , 255 , 255 , 255 } ,
{ 3 , 5 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 5 , 6 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 5 , 6 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 5 , 6 , 255 , 255 , 255 } ,
{ 2 , 3 , 5 , 6 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 5 , 6 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 5 , 6 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 5 , 6 , 255 , 255 } ,
{ 4 , 5 , 6 , 255 , 255 , 255 , 255 , 255 } , { 0 , 4 , 5 , 6 , 255 , 255 , 255 , 255 } ,
{ 1 , 4 , 5 , 6 , 255 , 255 , 255 , 255 } , { 0 , 1 , 4 , 5 , 6 , 255 , 255 , 255 } ,
{ 2 , 4 , 5 , 6 , 255 , 255 , 255 , 255 } , { 0 , 2 , 4 , 5 , 6 , 255 , 255 , 255 } ,
{ 1 , 2 , 4 , 5 , 6 , 255 , 255 , 255 } , { 0 , 1 , 2 , 4 , 5 , 6 , 255 , 255 } ,
{ 3 , 4 , 5 , 6 , 255 , 255 , 255 , 255 } , { 0 , 3 , 4 , 5 , 6 , 255 , 255 , 255 } ,
{ 1 , 3 , 4 , 5 , 6 , 255 , 255 , 255 } , { 0 , 1 , 3 , 4 , 5 , 6 , 255 , 255 } ,
{ 2 , 3 , 4 , 5 , 6 , 255 , 255 , 255 } , { 0 , 2 , 3 , 4 , 5 , 6 , 255 , 255 } ,
{ 1 , 2 , 3 , 4 , 5 , 6 , 255 , 255 } , { 0 , 1 , 2 , 3 , 4 , 5 , 6 , 255 } ,
{ 7 , 255 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 7 , 255 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 7 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 7 , 255 , 255 , 255 , 255 , 255 } ,
{ 2 , 7 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 7 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 7 , 255 , 255 , 255 , 255 } ,
{ 3 , 7 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 7 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 7 , 255 , 255 , 255 , 255 } ,
{ 2 , 3 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 7 , 255 , 255 , 255 } ,
{ 4 , 7 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 4 , 7 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 4 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 4 , 7 , 255 , 255 , 255 , 255 } ,
{ 2 , 4 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 4 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 4 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 4 , 7 , 255 , 255 , 255 } ,
{ 3 , 4 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 4 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 4 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 4 , 7 , 255 , 255 , 255 } ,
{ 2 , 3 , 4 , 7 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 4 , 7 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 4 , 7 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 4 , 7 , 255 , 255 } ,
{ 5 , 7 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 5 , 7 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 5 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 5 , 7 , 255 , 255 , 255 , 255 } ,
{ 2 , 5 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 5 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 5 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 5 , 7 , 255 , 255 , 255 } ,
{ 3 , 5 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 5 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 5 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 5 , 7 , 255 , 255 , 255 } ,
{ 2 , 3 , 5 , 7 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 5 , 7 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 5 , 7 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 5 , 7 , 255 , 255 } ,
{ 4 , 5 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 4 , 5 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 4 , 5 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 4 , 5 , 7 , 255 , 255 , 255 } ,
{ 2 , 4 , 5 , 7 , 255 , 255 , 255 , 255 } , { 0 , 2 , 4 , 5 , 7 , 255 , 255 , 255 } ,
{ 1 , 2 , 4 , 5 , 7 , 255 , 255 , 255 } , { 0 , 1 , 2 , 4 , 5 , 7 , 255 , 255 } ,
{ 3 , 4 , 5 , 7 , 255 , 255 , 255 , 255 } , { 0 , 3 , 4 , 5 , 7 , 255 , 255 , 255 } ,
{ 1 , 3 , 4 , 5 , 7 , 255 , 255 , 255 } , { 0 , 1 , 3 , 4 , 5 , 7 , 255 , 255 } ,
{ 2 , 3 , 4 , 5 , 7 , 255 , 255 , 255 } , { 0 , 2 , 3 , 4 , 5 , 7 , 255 , 255 } ,
{ 1 , 2 , 3 , 4 , 5 , 7 , 255 , 255 } , { 0 , 1 , 2 , 3 , 4 , 5 , 7 , 255 } ,
{ 6 , 7 , 255 , 255 , 255 , 255 , 255 , 255 } , { 0 , 6 , 7 , 255 , 255 , 255 , 255 , 255 } ,
{ 1 , 6 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 1 , 6 , 7 , 255 , 255 , 255 , 255 } ,
{ 2 , 6 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 2 , 6 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 2 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 2 , 6 , 7 , 255 , 255 , 255 } ,
{ 3 , 6 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 3 , 6 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 3 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 3 , 6 , 7 , 255 , 255 , 255 } ,
{ 2 , 3 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 2 , 3 , 6 , 7 , 255 , 255 , 255 } ,
{ 1 , 2 , 3 , 6 , 7 , 255 , 255 , 255 } , { 0 , 1 , 2 , 3 , 6 , 7 , 255 , 255 } ,
{ 4 , 6 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 4 , 6 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 4 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 4 , 6 , 7 , 255 , 255 , 255 } ,
{ 2 , 4 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 2 , 4 , 6 , 7 , 255 , 255 , 255 } ,
{ 1 , 2 , 4 , 6 , 7 , 255 , 255 , 255 } , { 0 , 1 , 2 , 4 , 6 , 7 , 255 , 255 } ,
{ 3 , 4 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 3 , 4 , 6 , 7 , 255 , 255 , 255 } ,
{ 1 , 3 , 4 , 6 , 7 , 255 , 255 , 255 } , { 0 , 1 , 3 , 4 , 6 , 7 , 255 , 255 } ,
{ 2 , 3 , 4 , 6 , 7 , 255 , 255 , 255 } , { 0 , 2 , 3 , 4 , 6 , 7 , 255 , 255 } ,
{ 1 , 2 , 3 , 4 , 6 , 7 , 255 , 255 } , { 0 , 1 , 2 , 3 , 4 , 6 , 7 , 255 } ,
{ 5 , 6 , 7 , 255 , 255 , 255 , 255 , 255 } , { 0 , 5 , 6 , 7 , 255 , 255 , 255 , 255 } ,
{ 1 , 5 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 1 , 5 , 6 , 7 , 255 , 255 , 255 } ,
{ 2 , 5 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 2 , 5 , 6 , 7 , 255 , 255 , 255 } ,
{ 1 , 2 , 5 , 6 , 7 , 255 , 255 , 255 } , { 0 , 1 , 2 , 5 , 6 , 7 , 255 , 255 } ,
{ 3 , 5 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 3 , 5 , 6 , 7 , 255 , 255 , 255 } ,
{ 1 , 3 , 5 , 6 , 7 , 255 , 255 , 255 } , { 0 , 1 , 3 , 5 , 6 , 7 , 255 , 255 } ,
{ 2 , 3 , 5 , 6 , 7 , 255 , 255 , 255 } , { 0 , 2 , 3 , 5 , 6 , 7 , 255 , 255 } ,
{ 1 , 2 , 3 , 5 , 6 , 7 , 255 , 255 } , { 0 , 1 , 2 , 3 , 5 , 6 , 7 , 255 } ,
{ 4 , 5 , 6 , 7 , 255 , 255 , 255 , 255 } , { 0 , 4 , 5 , 6 , 7 , 255 , 255 , 255 } ,
{ 1 , 4 , 5 , 6 , 7 , 255 , 255 , 255 } , { 0 , 1 , 4 , 5 , 6 , 7 , 255 , 255 } ,
{ 2 , 4 , 5 , 6 , 7 , 255 , 255 , 255 } , { 0 , 2 , 4 , 5 , 6 , 7 , 255 , 255 } ,
{ 1 , 2 , 4 , 5 , 6 , 7 , 255 , 255 } , { 0 , 1 , 2 , 4 , 5 , 6 , 7 , 255 } ,
{ 3 , 4 , 5 , 6 , 7 , 255 , 255 , 255 } , { 0 , 3 , 4 , 5 , 6 , 7 , 255 , 255 } ,
{ 1 , 3 , 4 , 5 , 6 , 7 , 255 , 255 } , { 0 , 1 , 3 , 4 , 5 , 6 , 7 , 255 } ,
{ 2 , 3 , 4 , 5 , 6 , 7 , 255 , 255 } , { 0 , 2 , 3 , 4 , 5 , 6 , 7 , 255 } ,
{ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 255 } , { 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 } };
#endif
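
The two tables above can be reproduced bit by bit, which is a handy way to sanity-check individual entries. The sketch below is only an illustration: `rank_in_byte` and `select_in_byte` are hypothetical names, and the loops compute what the precomputed tables look up in constant time.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 1-bits in a byte: the value stored in rank_lookup_table[byte]. */
static uint8_t rank_in_byte(uint8_t byte)
{
    uint8_t count = 0;

    while (byte) {
        count = (uint8_t)(count + (byte & 1u));
        byte >>= 1;
    }
    return count;
}

/* Position (counted from the least significant bit) of the j-th 1-bit,
 * j being 0-based, or the escape value 255 if the byte has fewer than
 * j + 1 bits set: the value stored in select_lookup_table[byte][j]. */
static uint8_t select_in_byte(uint8_t byte, unsigned j)
{
    unsigned pos;

    assert(j < 8); /* the table has 8 columns */
    for (pos = 0; pos < 8; pos++) {
        if ((byte >> pos) & 1u) {
            if (j == 0)
                return (uint8_t)pos;
            j--;
        }
    }
    return 255;
}
```

For the byte 01010101 used in the comments, `rank_in_byte(0x55)` gives 4 and `select_in_byte(0x55, j)` gives 0, 2, 4, 6 for j = 0..3 and 255 for j = 4.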

@ -0,0 +1,51 @@
#include "vqueue.h"
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
struct __vqueue_t
{
cmph_uint32 * values;
cmph_uint32 beg, end, capacity;
};
vqueue_t * vqueue_new(cmph_uint32 capacity)
{
size_t capacity_plus_one = capacity + 1;
vqueue_t *q = (vqueue_t *)malloc(sizeof(vqueue_t));
assert(q);
q->values = (cmph_uint32 *)calloc(capacity_plus_one, sizeof(cmph_uint32));
q->beg = q->end = 0;
q->capacity = (cmph_uint32) capacity_plus_one;
return q;
}
cmph_uint8 vqueue_is_empty(vqueue_t * q)
{
return (cmph_uint8)(q->beg == q->end);
}
void vqueue_insert(vqueue_t * q, cmph_uint32 val)
{
assert((q->end + 1)%q->capacity != q->beg); // Is queue full?
q->end = (q->end + 1)%q->capacity;
q->values[q->end] = val;
}
cmph_uint32 vqueue_remove(vqueue_t * q)
{
assert(!vqueue_is_empty(q)); // Is queue empty?
q->beg = (q->beg + 1)%q->capacity;
return q->values[q->beg];
}
void vqueue_print(vqueue_t * q)
{
cmph_uint32 i;
for (i = q->beg; i != q->end; i = (i + 1)%q->capacity)
fprintf(stderr, "%u\n", q->values[(i + 1)%q->capacity]);
}
void vqueue_destroy(vqueue_t *q)
{
free(q->values); q->values = NULL; free(q);
}

@ -0,0 +1,18 @@
#ifndef __CMPH_VQUEUE_H__
#define __CMPH_VQUEUE_H__
#include "cmph_types.h"
typedef struct __vqueue_t vqueue_t;
vqueue_t * vqueue_new(cmph_uint32 capacity);
cmph_uint8 vqueue_is_empty(vqueue_t * q);
void vqueue_insert(vqueue_t * q, cmph_uint32 val);
cmph_uint32 vqueue_remove(vqueue_t * q);
void vqueue_print(vqueue_t * q);
void vqueue_destroy(vqueue_t * q);
#endif

@ -0,0 +1,79 @@
#include "vstack.h"
#include <stdlib.h>
#include <assert.h>
//#define DEBUG
#include "debug.h"
struct __vstack_t
{
cmph_uint32 pointer;
cmph_uint32 *values;
cmph_uint32 capacity;
};
vstack_t *vstack_new(void)
{
vstack_t *stack = (vstack_t *)malloc(sizeof(vstack_t));
assert(stack);
stack->pointer = 0;
stack->values = NULL;
stack->capacity = 0;
return stack;
}
void vstack_destroy(vstack_t *stack)
{
assert(stack);
free(stack->values);
free(stack);
}
void vstack_push(vstack_t *stack, cmph_uint32 val)
{
assert(stack);
vstack_reserve(stack, stack->pointer + 1);
stack->values[stack->pointer] = val;
++(stack->pointer);
}
void vstack_pop(vstack_t *stack)
{
assert(stack);
assert(stack->pointer > 0);
--(stack->pointer);
}
cmph_uint32 vstack_top(vstack_t *stack)
{
assert(stack);
assert(stack->pointer > 0);
return stack->values[(stack->pointer - 1)];
}
int vstack_empty(vstack_t *stack)
{
assert(stack);
return stack->pointer == 0;
}
cmph_uint32 vstack_size(vstack_t *stack)
{
return stack->pointer;
}
void vstack_reserve(vstack_t *stack, cmph_uint32 size)
{
assert(stack);
if (stack->capacity < size)
{
cmph_uint32 new_capacity = stack->capacity + 1;
DEBUGP("Increasing current capacity %u to %u\n", stack->capacity, size);
while (new_capacity < size)
{
new_capacity *= 2;
}
stack->values = (cmph_uint32 *)realloc(stack->values, sizeof(cmph_uint32)*new_capacity);
assert(stack->values);
stack->capacity = new_capacity;
DEBUGP("Increased\n");
}
}
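
The growth rule in `vstack_reserve()` can be isolated as a pure function, which makes its behaviour easier to reason about. The sketch below is an assumption-laden restatement, not cmph code: `grown_capacity` is a hypothetical name and `uint32_t` stands in for `cmph_uint32`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring vstack_reserve(): if the current capacity
 * is too small, start from capacity + 1 and double until size fits. */
uint32_t grown_capacity(uint32_t capacity, uint32_t size)
{
    uint32_t new_capacity;

    if (capacity >= size)
        return capacity; /* nothing to do */

    new_capacity = capacity + 1;
    while (new_capacity < size)
        new_capacity *= 2;
    return new_capacity;
}
```

One consequence worth noting: when the requested size is exactly `capacity + 1`, as it is for a push onto a full stack, the loop body never runs and the capacity grows by a single element. Doubling only takes effect when a caller reserves well beyond the current capacity.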

@ -0,0 +1,18 @@
#ifndef __CMPH_VSTACK_H__
#define __CMPH_VSTACK_H__
#include "cmph_types.h"
typedef struct __vstack_t vstack_t;
vstack_t *vstack_new(void);
void vstack_destroy(vstack_t *stack);
void vstack_push(vstack_t *stack, cmph_uint32 val);
cmph_uint32 vstack_top(vstack_t *stack);
void vstack_pop(vstack_t *stack);
int vstack_empty(vstack_t *stack);
cmph_uint32 vstack_size(vstack_t *stack);
void vstack_reserve(vstack_t *stack, cmph_uint32 size);
#endif

@ -0,0 +1,179 @@
#ifdef WIN32
/*****************************************************************************
*
* MODULE NAME : GETOPT.C
*
* COPYRIGHTS:
* This module contains code made available by IBM
* Corporation on an AS IS basis. Any one receiving the
* module is considered to be licensed under IBM copyrights
* to use the IBM-provided source code in any way he or she
* deems fit, including copying it, compiling it, modifying
* it, and redistributing it, with or without
* modifications. No license under any IBM patents or
* patent applications is to be implied from this copyright
* license.
*
* A user of the module should understand that IBM cannot
* provide technical support for the module and will not be
* responsible for any consequences of use of the program.
*
* Any notices, including this one, are not to be removed
* from the module without the prior written consent of
* IBM.
*
* AUTHOR: Original author:
* G. R. Blair (BOBBLAIR at AUSVM1)
* Internet: bobblair@bobblair.austin.ibm.com
*
* Extensively revised by:
* John Q. Walker II, Ph.D. (JOHHQ at RALVM6)
* Internet: johnq@ralvm6.vnet.ibm.com
*
*****************************************************************************/
/******************************************************************************
* getopt()
*
* The getopt() function is a command line parser. It returns the next
* option character in argv that matches an option character in opstring.
*
* The argv argument points to an array of argc+1 elements containing argc
* pointers to character strings followed by a null pointer.
*
* The opstring argument points to a string of option characters; if an
* option character is followed by a colon, the option is expected to have
* an argument that may or may not be separated from it by white space.
* The external variable optarg is set to point to the start of the option
* argument on return from getopt().
*
* The getopt() function places in optind the argv index of the next argument
* to be processed. The system initializes the external variable optind to
* 1 before the first call to getopt().
*
* When all options have been processed (that is, up to the first nonoption
* argument), getopt() returns EOF. The special option "--" may be used to
* delimit the end of the options; EOF will be returned, and "--" will be
* skipped.
*
* The getopt() function returns a question mark (?) when it encounters an
* option character not included in opstring. This error message can be
* disabled by setting opterr to zero. Otherwise, it returns the option
* character that was detected.
*
* If the special option "--" is detected, or all options have been
* processed, EOF is returned.
*
* Options are marked by either a minus sign (-) or a slash (/).
*
* No errors are defined.
*****************************************************************************/
#include <stdio.h> /* for EOF */
#include <string.h> /* for strchr() */
/* static (global) variables that are specified as exported by getopt() */
extern char *optarg; /* pointer to the start of the option argument */
extern int optind; /* number of the next argv[] to be evaluated */
extern int opterr; /* non-zero if a question mark should be returned
when a non-valid option character is detected */
/* handle possible future character set concerns by putting this in a macro */
#define _next_char(string) (char)(*(string+1))
int getopt(int argc, char *argv[], char *opstring)
{
static char *pIndexPosition = NULL; /* place inside current argv string */
char *pArgString = NULL; /* where to start from next */
char *pOptString; /* the string in our program */
if (pIndexPosition != NULL) {
/* we last left off inside an argv string */
if (*(++pIndexPosition)) {
/* there is more to come in the most recent argv */
pArgString = pIndexPosition;
}
}
if (pArgString == NULL) {
/* we didn't leave off in the middle of an argv string */
if (optind >= argc) {
/* more command-line arguments than the argument count */
pIndexPosition = NULL; /* not in the middle of anything */
return EOF; /* used up all command-line arguments */
}
/*---------------------------------------------------------------------
* If the next argv[] is not an option, there can be no more options.
*-------------------------------------------------------------------*/
pArgString = argv[optind++]; /* set this to the next argument ptr */
if (('/' != *pArgString) && /* doesn't start with a slash or a dash? */
('-' != *pArgString)) {
--optind; /* point to current arg once we're done */
optarg = NULL; /* no argument follows the option */
pIndexPosition = NULL; /* not in the middle of anything */
return EOF; /* used up all the command-line flags */
}
/* check for special end-of-flags markers */
if ((strcmp(pArgString, "-") == 0) ||
(strcmp(pArgString, "--") == 0)) {
optarg = NULL; /* no argument follows the option */
pIndexPosition = NULL; /* not in the middle of anything */
return EOF; /* encountered the special flag */
}
pArgString++; /* look past the / or - */
}
if (':' == *pArgString) { /* is it a colon? */
/*---------------------------------------------------------------------
* Rare case: if opterr is non-zero, return a question mark;
* otherwise, just return the colon we're on.
*-------------------------------------------------------------------*/
return (opterr ? (int)'?' : (int)':');
}
else if ((pOptString = strchr(opstring, *pArgString)) == 0) {
/*---------------------------------------------------------------------
* The letter on the command-line wasn't any good.
*-------------------------------------------------------------------*/
optarg = NULL; /* no argument follows the option */
pIndexPosition = NULL; /* not in the middle of anything */
return (opterr ? (int)'?' : (int)*pArgString);
}
else {
/*---------------------------------------------------------------------
* The letter on the command-line matches one we expect to see
*-------------------------------------------------------------------*/
if (':' == _next_char(pOptString)) { /* is the next letter a colon? */
/* It is a colon. Look for an argument string. */
if ('\0' != _next_char(pArgString)) { /* argument in this argv? */
optarg = &pArgString[1]; /* Yes, it is */
}
else {
/*-------------------------------------------------------------
* The argument string must be in the next argv.
* But, what if there is none (bad input from the user)?
* In that case, return the letter, and optarg as NULL.
*-----------------------------------------------------------*/
if (optind < argc)
optarg = argv[optind++];
else {
optarg = NULL;
return (opterr ? (int)'?' : (int)*pArgString);
}
}
pIndexPosition = NULL; /* not in the middle of anything */
}
else {
/* it's not a colon, so just return the letter */
optarg = NULL; /* no argument follows the option */
pIndexPosition = pArgString; /* point to the letter we're on */
}
return (int)*pArgString; /* return the letter that matched */
}
}
#endif //WIN32

@ -0,0 +1,25 @@
#ifdef __cplusplus
extern "C" {
#endif
#ifndef WIN32
#include <getopt.h>
#else
#ifndef _GETOPT_
#define _GETOPT_
#include <stdio.h> /* for EOF */
#include <string.h> /* for strchr() */
char *optarg = NULL; /* pointer to the start of the option argument */
int optind = 1; /* number of the next argv[] to be evaluated */
int opterr = 1; /* non-zero if a question mark should be returned */
int getopt(int argc, char *argv[], char *opstring);
#endif //_GETOPT_
#endif //WIN32
#ifdef __cplusplus
}
#endif

girepository/docs.c Normal file

@ -0,0 +1,35 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Dump introspection data
*
* Copyright (C) 2013 Dieter Verfaillie <dieterv@optionexplicit.be>
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
/* This file collects documentation for macros, typedefs and
* the like, which have no good home in any of the 'real' source
* files.
*/
/**
* SECTION:gicommontypes
* @title: Common Types
* @short_description: TODO
*
* TODO
*/

girepository/gdump.c Normal file

@ -0,0 +1,684 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Dump introspection data
*
* Copyright (C) 2008 Colin Walters <walters@verbum.org>
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
/* This file is both compiled into libgirepository.so, and installed
* on the filesystem. But for the dumper, we want to avoid linking
* to libgirepository; see
* https://bugzilla.gnome.org/show_bug.cgi?id=630342
*/
#ifdef GI_COMPILATION
#include "config.h"
#include "girepository.h"
#endif
#include <glib.h>
#include <glib-object.h>
#include <gio/gio.h>
#include <stdlib.h>
#include <string.h>
static void
escaped_printf (GOutputStream *out, const char *fmt, ...) G_GNUC_PRINTF (2, 3);
static void
escaped_printf (GOutputStream *out, const char *fmt, ...)
{
char *str;
va_list args;
gsize written;
GError *error = NULL;
va_start (args, fmt);
str = g_markup_vprintf_escaped (fmt, args);
if (!g_output_stream_write_all (out, str, strlen (str), &written, NULL, &error))
{
g_critical ("failed to write to iochannel: %s", error->message);
g_clear_error (&error);
}
g_free (str);
va_end (args);
}
static void
goutput_write (GOutputStream *out, const char *str)
{
gsize written;
GError *error = NULL;
if (!g_output_stream_write_all (out, str, strlen (str), &written, NULL, &error))
{
g_critical ("failed to write to iochannel: %s", error->message);
g_clear_error (&error);
}
}
typedef GType (*GetTypeFunc)(void);
typedef GQuark (*ErrorQuarkFunc)(void);
static GType
invoke_get_type (GModule *self, const char *symbol, GError **error)
{
GetTypeFunc sym;
GType ret;
if (!g_module_symbol (self, symbol, (void**)&sym))
{
g_set_error (error,
G_IO_ERROR,
G_IO_ERROR_FAILED,
"Failed to find symbol '%s'", symbol);
return G_TYPE_INVALID;
}
ret = sym ();
if (ret == G_TYPE_INVALID)
{
g_set_error (error,
G_IO_ERROR,
G_IO_ERROR_FAILED,
"Function '%s' returned G_TYPE_INVALID", symbol);
}
return ret;
}
static GQuark
invoke_error_quark (GModule *self, const char *symbol, GError **error)
{
ErrorQuarkFunc sym;
if (!g_module_symbol (self, symbol, (void**)&sym))
{
g_set_error (error,
G_IO_ERROR,
G_IO_ERROR_FAILED,
"Failed to find symbol '%s'", symbol);
return 0; /* 0 is never a valid GQuark; the caller treats it as failure */
}
return sym ();
}
static char *
value_transform_to_string (const GValue *value)
{
GValue tmp = G_VALUE_INIT;
char *s = NULL;
g_value_init (&tmp, G_TYPE_STRING);
if (g_value_transform (value, &tmp))
{
const char *str = g_value_get_string (&tmp);
if (str != NULL)
s = g_strescape (str, NULL);
}
g_value_unset (&tmp);
return s;
}
/* A simpler version of g_strdup_value_contents(), but with stable
* output and less complex semantics
*/
static char *
value_to_string (const GValue *value)
{
if (value == NULL)
return NULL;
if (G_VALUE_HOLDS_STRING (value))
{
const char *s = g_value_get_string (value);
if (s == NULL)
return g_strdup ("NULL");
return g_strescape (s, NULL);
}
else
{
GType value_type = G_VALUE_TYPE (value);
switch (G_TYPE_FUNDAMENTAL (value_type))
{
case G_TYPE_BOXED:
if (g_value_get_boxed (value) == NULL)
return NULL;
else
return value_transform_to_string (value);
break;
case G_TYPE_OBJECT:
if (g_value_get_object (value) == NULL)
return NULL;
else
return value_transform_to_string (value);
break;
case G_TYPE_POINTER:
return NULL;
default:
return value_transform_to_string (value);
}
}
return NULL;
}
static void
dump_properties (GType type, GOutputStream *out)
{
guint i;
guint n_properties;
GParamSpec **props;
if (G_TYPE_FUNDAMENTAL (type) == G_TYPE_OBJECT)
{
GObjectClass *klass;
klass = g_type_class_ref (type);
props = g_object_class_list_properties (klass, &n_properties);
}
else
{
void *klass;
klass = g_type_default_interface_ref (type);
props = g_object_interface_list_properties (klass, &n_properties);
}
for (i = 0; i < n_properties; i++)
{
GParamSpec *prop;
prop = props[i];
if (prop->owner_type != type)
continue;
const GValue *v = g_param_spec_get_default_value (prop);
char *default_value = value_to_string (v);
if (v != NULL && default_value != NULL)
{
escaped_printf (out, " <property name=\"%s\" type=\"%s\" flags=\"%d\" default-value=\"%s\"/>\n",
prop->name,
g_type_name (prop->value_type),
prop->flags,
default_value);
}
else
{
escaped_printf (out, " <property name=\"%s\" type=\"%s\" flags=\"%d\"/>\n",
prop->name,
g_type_name (prop->value_type),
prop->flags);
}
g_free (default_value);
}
g_free (props);
}
static void
dump_signals (GType type, GOutputStream *out)
{
guint i;
guint n_sigs;
guint *sig_ids;
sig_ids = g_signal_list_ids (type, &n_sigs);
for (i = 0; i < n_sigs; i++)
{
guint sigid;
GSignalQuery query;
guint j;
sigid = sig_ids[i];
g_signal_query (sigid, &query);
escaped_printf (out, " <signal name=\"%s\" return=\"%s\"",
query.signal_name, g_type_name (query.return_type));
if (query.signal_flags & G_SIGNAL_RUN_FIRST)
escaped_printf (out, " when=\"first\"");
else if (query.signal_flags & G_SIGNAL_RUN_LAST)
escaped_printf (out, " when=\"last\"");
else if (query.signal_flags & G_SIGNAL_RUN_CLEANUP)
escaped_printf (out, " when=\"cleanup\"");
#if GLIB_CHECK_VERSION(2, 29, 15)
else if (query.signal_flags & G_SIGNAL_MUST_COLLECT)
escaped_printf (out, " when=\"must-collect\"");
#endif
if (query.signal_flags & G_SIGNAL_NO_RECURSE)
escaped_printf (out, " no-recurse=\"1\"");
if (query.signal_flags & G_SIGNAL_DETAILED)
escaped_printf (out, " detailed=\"1\"");
if (query.signal_flags & G_SIGNAL_ACTION)
escaped_printf (out, " action=\"1\"");
if (query.signal_flags & G_SIGNAL_NO_HOOKS)
escaped_printf (out, " no-hooks=\"1\"");
goutput_write (out, ">\n");
for (j = 0; j < query.n_params; j++)
{
escaped_printf (out, " <param type=\"%s\"/>\n",
g_type_name (query.param_types[j]));
}
goutput_write (out, " </signal>\n");
}
g_free (sig_ids);
}
static void
dump_object_type (GType type, const char *symbol, GOutputStream *out)
{
guint n_interfaces;
guint i;
GType *interfaces;
escaped_printf (out, " <class name=\"%s\" get-type=\"%s\"",
g_type_name (type), symbol);
if (type != G_TYPE_OBJECT)
{
GString *parent_str;
GType parent;
gboolean first = TRUE;
parent = g_type_parent (type);
parent_str = g_string_new ("");
while (parent != G_TYPE_INVALID)
{
if (first)
first = FALSE;
else
g_string_append_c (parent_str, ',');
g_string_append (parent_str, g_type_name (parent));
parent = g_type_parent (parent);
}
escaped_printf (out, " parents=\"%s\"", parent_str->str);
g_string_free (parent_str, TRUE);
}
if (G_TYPE_IS_ABSTRACT (type))
escaped_printf (out, " abstract=\"1\"");
#if GLIB_CHECK_VERSION (2, 70, 0)
if (G_TYPE_IS_FINAL (type))
escaped_printf (out, " final=\"1\"");
#endif
goutput_write (out, ">\n");
interfaces = g_type_interfaces (type, &n_interfaces);
for (i = 0; i < n_interfaces; i++)
{
GType itype = interfaces[i];
escaped_printf (out, " <implements name=\"%s\"/>\n",
g_type_name (itype));
}
g_free (interfaces);
dump_properties (type, out);
dump_signals (type, out);
goutput_write (out, " </class>\n");
}
static void
dump_interface_type (GType type, const char *symbol, GOutputStream *out)
{
guint n_interfaces;
guint i;
GType *interfaces;
escaped_printf (out, " <interface name=\"%s\" get-type=\"%s\">\n",
g_type_name (type), symbol);
interfaces = g_type_interface_prerequisites (type, &n_interfaces);
for (i = 0; i < n_interfaces; i++)
{
GType itype = interfaces[i];
if (itype == G_TYPE_OBJECT)
{
/* Treat this as implicit for now; in theory GInterfaces are
* supported on things like GstMiniObject, but right now
* the introspection system only supports GObject.
* http://bugzilla.gnome.org/show_bug.cgi?id=559706
*/
continue;
}
escaped_printf (out, " <prerequisite name=\"%s\"/>\n",
g_type_name (itype));
}
g_free (interfaces);
dump_properties (type, out);
dump_signals (type, out);
goutput_write (out, " </interface>\n");
}
static void
dump_boxed_type (GType type, const char *symbol, GOutputStream *out)
{
escaped_printf (out, " <boxed name=\"%s\" get-type=\"%s\"/>\n",
g_type_name (type), symbol);
}
static void
dump_flags_type (GType type, const char *symbol, GOutputStream *out)
{
guint i;
GFlagsClass *klass;
klass = g_type_class_ref (type);
escaped_printf (out, " <flags name=\"%s\" get-type=\"%s\">\n",
g_type_name (type), symbol);
for (i = 0; i < klass->n_values; i++)
{
GFlagsValue *value = &(klass->values[i]);
escaped_printf (out, " <member name=\"%s\" nick=\"%s\" value=\"%u\"/>\n",
value->value_name, value->value_nick, value->value);
}
goutput_write (out, " </flags>\n");
}
static void
dump_enum_type (GType type, const char *symbol, GOutputStream *out)
{
guint i;
GEnumClass *klass;
klass = g_type_class_ref (type);
escaped_printf (out, " <enum name=\"%s\" get-type=\"%s\">\n",
g_type_name (type), symbol);
for (i = 0; i < klass->n_values; i++)
{
GEnumValue *value = &(klass->values[i]);
escaped_printf (out, " <member name=\"%s\" nick=\"%s\" value=\"%d\"/>\n",
value->value_name, value->value_nick, value->value);
}
goutput_write (out, " </enum>\n");
}
static void
dump_fundamental_type (GType type, const char *symbol, GOutputStream *out)
{
guint n_interfaces;
guint i;
GType *interfaces;
GString *parent_str;
GType parent;
gboolean first = TRUE;
escaped_printf (out, " <fundamental name=\"%s\" get-type=\"%s\"",
g_type_name (type), symbol);
if (G_TYPE_IS_ABSTRACT (type))
escaped_printf (out, " abstract=\"1\"");
#if GLIB_CHECK_VERSION (2, 70, 0)
if (G_TYPE_IS_FINAL (type))
escaped_printf (out, " final=\"1\"");
#endif
if (G_TYPE_IS_INSTANTIATABLE (type))
escaped_printf (out, " instantiatable=\"1\"");
parent = g_type_parent (type);
parent_str = g_string_new ("");
while (parent != G_TYPE_INVALID)
{
if (first)
first = FALSE;
else
g_string_append_c (parent_str, ',');
if (!g_type_name (parent))
break;
g_string_append (parent_str, g_type_name (parent));
parent = g_type_parent (parent);
}
if (parent_str->len > 0)
escaped_printf (out, " parents=\"%s\"", parent_str->str);
g_string_free (parent_str, TRUE);
goutput_write (out, ">\n");
interfaces = g_type_interfaces (type, &n_interfaces);
for (i = 0; i < n_interfaces; i++)
{
GType itype = interfaces[i];
escaped_printf (out, " <implements name=\"%s\"/>\n",
g_type_name (itype));
}
g_free (interfaces);
goutput_write (out, " </fundamental>\n");
}
static void
dump_type (GType type, const char *symbol, GOutputStream *out)
{
switch (g_type_fundamental (type))
{
case G_TYPE_OBJECT:
dump_object_type (type, symbol, out);
break;
case G_TYPE_INTERFACE:
dump_interface_type (type, symbol, out);
break;
case G_TYPE_BOXED:
dump_boxed_type (type, symbol, out);
break;
case G_TYPE_FLAGS:
dump_flags_type (type, symbol, out);
break;
case G_TYPE_ENUM:
dump_enum_type (type, symbol, out);
break;
case G_TYPE_POINTER:
/* GValue, etc. Just skip them. */
break;
default:
dump_fundamental_type (type, symbol, out);
break;
}
}
static void
dump_error_quark (GQuark quark, const char *symbol, GOutputStream *out)
{
escaped_printf (out, " <error-quark function=\"%s\" domain=\"%s\"/>\n",
symbol, g_quark_to_string (quark));
}
/**
* g_irepository_dump:
* @arg: Comma-separated pair of input and output filenames
* @error: a %GError
*
* Argument specified is a comma-separated pair of filenames; i.e. of
* the form "input.txt,output.xml". The input file should be a
* UTF-8 Unix-line-ending text file, with each line containing either
* "get-type:" followed by the name of a GType _get_type function, or
* "error-quark:" followed by the name of an error quark function. No
* extra whitespace is allowed.
*
* The output file should already exist, but be empty. This function will
* overwrite its contents.
*
* Returns: %TRUE on success, %FALSE on error
*/
#ifndef GI_COMPILATION
static gboolean
dump_irepository (const char *arg, GError **error) G_GNUC_UNUSED;
static gboolean
dump_irepository (const char *arg, GError **error)
#else
gboolean
g_irepository_dump (const char *arg, GError **error)
#endif
{
GHashTable *output_types;
char **args;
GFile *input_file;
GFile *output_file;
GFileInputStream *input;
GFileOutputStream *output;
GDataInputStream *in;
GModule *self;
gboolean caught_error = FALSE;
self = g_module_open (NULL, 0);
if (!self)
{
g_set_error (error,
G_IO_ERROR,
G_IO_ERROR_FAILED,
"failed to open self: %s",
g_module_error ());
return FALSE;
}
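args = g_strsplit (arg, ",", 2);
/* Without a comma there is no output filename: args[1] would be NULL */
if (args[0] == NULL || args[1] == NULL)
{
g_set_error (error,
G_IO_ERROR,
G_IO_ERROR_FAILED,
"Expected a comma-separated pair of filenames, got '%s'", arg);
g_strfreev (args);
return FALSE;
}
input_file = g_file_new_for_path (args[0]);
output_file = g_file_new_for_path (args[1]);
g_strfreev (args);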
input = g_file_read (input_file, NULL, error);
g_object_unref (input_file);
if (input == NULL)
{
g_object_unref (output_file);
return FALSE;
}
output = g_file_replace (output_file, NULL, FALSE, 0, NULL, error);
g_object_unref (output_file);
if (output == NULL)
{
g_input_stream_close (G_INPUT_STREAM (input), NULL, NULL);
g_object_unref (input);
return FALSE;
}
goutput_write (G_OUTPUT_STREAM (output), "<?xml version=\"1.0\"?>\n");
goutput_write (G_OUTPUT_STREAM (output), "<dump>\n");
output_types = g_hash_table_new (NULL, NULL);
in = g_data_input_stream_new (G_INPUT_STREAM (input));
g_object_unref (input);
while (TRUE)
{
gsize len;
char *line = g_data_input_stream_read_line (in, &len, NULL, NULL);
const char *function;
if (line == NULL || *line == '\0')
{
g_free (line);
break;
}
g_strchomp (line);
if (strncmp (line, "get-type:", strlen ("get-type:")) == 0)
{
GType type;
function = line + strlen ("get-type:");
type = invoke_get_type (self, function, error);
if (type == G_TYPE_INVALID)
{
g_printerr ("Invalid GType function: '%s'\n", function);
caught_error = TRUE;
g_free (line);
break;
}
if (g_hash_table_lookup (output_types, (gpointer) type))
goto next;
g_hash_table_insert (output_types, (gpointer) type, (gpointer) type);
dump_type (type, function, G_OUTPUT_STREAM (output));
}
else if (strncmp (line, "error-quark:", strlen ("error-quark:")) == 0)
{
GQuark quark;
function = line + strlen ("error-quark:");
quark = invoke_error_quark (self, function, error);
if (quark == 0)
{
g_printerr ("Invalid error quark function: '%s'\n", function);
caught_error = TRUE;
g_free (line);
break;
}
dump_error_quark (quark, function, G_OUTPUT_STREAM (output));
}
next:
g_free (line);
}
g_hash_table_destroy (output_types);
goutput_write (G_OUTPUT_STREAM (output), "</dump>\n");
{
/* Avoid overwriting an earlier set error */
caught_error |= !g_input_stream_close (G_INPUT_STREAM (in), NULL,
caught_error ? NULL : error);
caught_error |= !g_output_stream_close (G_OUTPUT_STREAM (output), NULL,
caught_error ? NULL : error);
}
g_object_unref (in);
g_object_unref (output);
return !caught_error;
}
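
The input format documented for `g_irepository_dump()` is simple enough to illustrate without GLib. The sketch below only classifies a line by its prefix, which is the same `strncmp()` test the loop above performs; `classify_line` and the `line_kind` enum are hypothetical names that do not exist in gdump.c, and the symbol lookup through `g_module_symbol()` is deliberately left out.

```c
#include <assert.h>
#include <string.h>

typedef enum { LINE_GET_TYPE, LINE_ERROR_QUARK, LINE_UNKNOWN } line_kind;

/* Hypothetical helper, not part of gdump.c: classify one input line and
 * point *function at the symbol name that follows the prefix. */
line_kind classify_line(const char *line, const char **function)
{
    static const char get_type[] = "get-type:";
    static const char error_quark[] = "error-quark:";

    assert(line != NULL && function != NULL);

    if (strncmp(line, get_type, sizeof get_type - 1) == 0) {
        *function = line + sizeof get_type - 1;
        return LINE_GET_TYPE;
    }
    if (strncmp(line, error_quark, sizeof error_quark - 1) == 0) {
        *function = line + sizeof error_quark - 1;
        return LINE_ERROR_QUARK;
    }
    *function = NULL;
    return LINE_UNKNOWN;
}
```

An input file in this format would therefore contain lines such as `get-type:g_file_get_type` or `error-quark:g_io_error_quark`, one per line and without extra whitespace.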

@ -0,0 +1,70 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: typelib validation, auxiliary functions
* related to the binary typelib format
*
* Copyright (C) 2011 Colin Walters
* Copyright (C) 2020 Gisle Vanem
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "gdump.c"
#ifdef G_OS_WIN32
#include <windows.h>
#include <io.h> /* For _get_osfhandle() */
#include <gio/gwin32outputstream.h>
#else
#include <gio/gunixoutputstream.h>
#endif
int
main (int argc,
char **argv)
{
int i;
GOutputStream *Stdout;
GModule *self;
#if defined(G_OS_WIN32)
HANDLE hnd = (HANDLE) _get_osfhandle (1);
g_return_val_if_fail (hnd && hnd != INVALID_HANDLE_VALUE, 1);
Stdout = g_win32_output_stream_new (hnd, FALSE);
#else
Stdout = g_unix_output_stream_new (1, FALSE);
#endif
self = g_module_open (NULL, 0);
for (i = 1; i < argc; i++)
{
GError *error = NULL;
GType type;
type = invoke_get_type (self, argv[i], &error);
if (!type)
{
g_printerr ("%s\n", error->message);
g_clear_error (&error);
}
else
dump_type (type, argv[i], Stdout);
}
return 0;
}

girepository/giarginfo.c Normal file

@ -0,0 +1,335 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Argument implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <glib.h>
#include "gitypelib-internal.h"
#include "girepository-private.h"
#include "giarginfo.h"
/* GIArgInfo functions */
/**
* SECTION:giarginfo
* @title: GIArgInfo
* @short_description: Struct representing an argument
*
* GIArgInfo represents an argument of a callable.
*
* An argument is always part of a #GICallableInfo.
*/
/**
* g_arg_info_get_direction:
* @info: a #GIArgInfo
*
* Obtain the direction of the argument. Check #GIDirection for possible
* direction values.
*
* Returns: the direction
*/
GIDirection
g_arg_info_get_direction (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_ARG_INFO (info), -1);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
if (blob->in && blob->out)
return GI_DIRECTION_INOUT;
else if (blob->out)
return GI_DIRECTION_OUT;
else
return GI_DIRECTION_IN;
}
/**
* g_arg_info_is_return_value:
* @info: a #GIArgInfo
*
* Obtain whether the argument is a return value. An argument is either a
* parameter or a return value.
*
* Returns: %TRUE if it is a return value
*/
gboolean
g_arg_info_is_return_value (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, FALSE);
g_return_val_if_fail (GI_IS_ARG_INFO (info), FALSE);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->return_value;
}
/**
* g_arg_info_is_caller_allocates:
* @info: a #GIArgInfo
*
* Obtain whether the argument is a pointer to a struct or object that will
* receive an output of a function. The default assumption for
* %GI_DIRECTION_OUT arguments that require allocation is that the
* callee allocates; if this returns %TRUE, the caller must allocate instead.
*
* Returns: %TRUE if caller is required to have allocated the argument
*/
gboolean
g_arg_info_is_caller_allocates (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, FALSE);
g_return_val_if_fail (GI_IS_ARG_INFO (info), FALSE);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->caller_allocates;
}
/**
* g_arg_info_is_optional:
* @info: a #GIArgInfo
*
* Obtain whether the argument is optional. For 'out' arguments this means
* that you can pass %NULL in order to ignore the result.
*
* Returns: %TRUE if it is an optional argument
*/
gboolean
g_arg_info_is_optional (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, FALSE);
g_return_val_if_fail (GI_IS_ARG_INFO (info), FALSE);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->optional;
}
/**
* g_arg_info_may_be_null:
* @info: a #GIArgInfo
*
* Obtain whether the type of the argument includes the possibility of %NULL.
* For 'in' values this means that %NULL is a valid value. For 'out'
* values, this means that %NULL may be returned.
*
* See also g_arg_info_is_optional().
*
* Returns: %TRUE if the value may be %NULL
*/
gboolean
g_arg_info_may_be_null (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, FALSE);
g_return_val_if_fail (GI_IS_ARG_INFO (info), FALSE);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->nullable;
}
/**
* g_arg_info_is_skip:
* @info: a #GIArgInfo
*
* Obtain whether an argument is only useful in C.
*
* Returns: %TRUE if argument is only useful in C.
* Since: 1.30
*/
gboolean
g_arg_info_is_skip (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, FALSE);
g_return_val_if_fail (GI_IS_ARG_INFO (info), FALSE);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->skip;
}
/**
* g_arg_info_get_ownership_transfer:
* @info: a #GIArgInfo
*
* Obtain the ownership transfer for this argument.
* #GITransfer contains a list of possible values.
*
* Returns: the transfer
*/
GITransfer
g_arg_info_get_ownership_transfer (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_ARG_INFO (info), -1);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
if (blob->transfer_ownership)
return GI_TRANSFER_EVERYTHING;
else if (blob->transfer_container_ownership)
return GI_TRANSFER_CONTAINER;
else
return GI_TRANSFER_NOTHING;
}
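/* Illustrative sketch, not part of libgirepository: the precedence used when
 * decoding the two transfer bits can be modelled without a typelib. The
 * SketchTransfer and SketchTransferFlags names are hypothetical stand-ins for
 * GITransfer and for the two ArgBlob bit-fields read above.
 */

```c
/* Hypothetical stand-in for GITransfer. */
typedef enum {
  SKETCH_TRANSFER_NOTHING,
  SKETCH_TRANSFER_CONTAINER,
  SKETCH_TRANSFER_EVERYTHING
} SketchTransfer;

/* Hypothetical stand-in for the two transfer bits stored in an ArgBlob. */
typedef struct {
  unsigned transfer_ownership : 1;
  unsigned transfer_container_ownership : 1;
} SketchTransferFlags;

/* Full ownership wins if both bits happen to be set; with neither bit
 * set, nothing is transferred. */
static SketchTransfer
sketch_decode_transfer (SketchTransferFlags flags)
{
  if (flags.transfer_ownership)
    return SKETCH_TRANSFER_EVERYTHING;
  else if (flags.transfer_container_ownership)
    return SKETCH_TRANSFER_CONTAINER;
  else
    return SKETCH_TRANSFER_NOTHING;
}
```

/* Two independent bits can encode four states, but only three are meaningful
 * here, so the order of the tests decides what the fourth state maps to.
 */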
/**
* g_arg_info_get_scope:
* @info: a #GIArgInfo
*
* Obtain the scope type for this argument. The scope type explains
* how a callback is going to be invoked, most importantly when
* the resources required to invoke it can be freed.
* #GIScopeType contains a list of possible values.
*
* Returns: the scope type
*/
GIScopeType
g_arg_info_get_scope (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_ARG_INFO (info), -1);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->scope;
}
/**
* g_arg_info_get_closure:
* @info: a #GIArgInfo
*
* Obtain the index of the user data argument. This is only valid
* for arguments which are callbacks.
*
* Returns: index of the user data argument or -1 if there is none
*/
gint
g_arg_info_get_closure (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_ARG_INFO (info), -1);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->closure;
}
/**
* g_arg_info_get_destroy:
* @info: a #GIArgInfo
*
* Obtains the index of the #GDestroyNotify argument. This is only valid
* for arguments which are callbacks.
*
* Returns: index of the #GDestroyNotify argument or -1 if there is none
*/
gint
g_arg_info_get_destroy (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ArgBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_ARG_INFO (info), -1);
blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->destroy;
}
/**
* g_arg_info_get_type:
* @info: a #GIArgInfo
*
* Obtain the type information for @info.
*
* Returns: (transfer full): the #GITypeInfo holding the type
* information for @info, free it with g_base_info_unref()
* when done.
*/
GITypeInfo *
g_arg_info_get_type (GIArgInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_ARG_INFO (info), NULL);
return _g_type_info_new ((GIBaseInfo*)info, rinfo->typelib, rinfo->offset + G_STRUCT_OFFSET (ArgBlob, arg_type));
}
/**
* g_arg_info_load_type:
* @info: a #GIArgInfo
* @type: (out caller-allocates): Initialized with information about type of @info
*
* Obtain information about the type of the given argument @info; this
* function is a variant of g_arg_info_get_type() designed for stack
* allocation.
*
* The initialized @type must not be referenced after @info is deallocated.
*/
void
g_arg_info_load_type (GIArgInfo *info,
GITypeInfo *type)
{
GIRealInfo *rinfo = (GIRealInfo*) info;
g_return_if_fail (info != NULL);
g_return_if_fail (GI_IS_ARG_INFO (info));
_g_type_info_init (type, (GIBaseInfo*)info, rinfo->typelib, rinfo->offset + G_STRUCT_OFFSET (ArgBlob, arg_type));
}

81
girepository/giarginfo.h Normal file

@ -0,0 +1,81 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Argument
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#pragma once
#if !defined (__GIREPOSITORY_H_INSIDE__) && !defined (GI_COMPILATION)
#error "Only <girepository.h> can be included directly."
#endif
#include <girepository/gitypes.h>
G_BEGIN_DECLS
/**
* GI_IS_ARG_INFO
* @info: an info structure
*
* Checks if @info is a GIArgInfo.
*/
#define GI_IS_ARG_INFO(info) \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_ARG)
GI_AVAILABLE_IN_ALL
GIDirection g_arg_info_get_direction (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_arg_info_is_return_value (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_arg_info_is_optional (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_arg_info_is_caller_allocates (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_arg_info_may_be_null (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_arg_info_is_skip (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
GITransfer g_arg_info_get_ownership_transfer (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
GIScopeType g_arg_info_get_scope (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
gint g_arg_info_get_closure (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
gint g_arg_info_get_destroy (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
GITypeInfo * g_arg_info_get_type (GIArgInfo *info);
GI_AVAILABLE_IN_ALL
void g_arg_info_load_type (GIArgInfo *info,
GITypeInfo *type);
G_END_DECLS

677
girepository/gibaseinfo.c Normal file

@ -0,0 +1,677 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Base struct implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <stdlib.h>
#include <string.h>
#include <glib.h>
#include <glib-object.h>
#include "gitypelib-internal.h"
#include "girepository-private.h"
#include "gibaseinfo.h"
#define INVALID_REFCOUNT 0x7FFFFFFF
/* GBoxed registration of BaseInfo. */
GType
g_base_info_gtype_get_type (void)
{
static GType our_type = 0;
if (our_type == 0)
our_type =
g_boxed_type_register_static ("GIBaseInfo",
(GBoxedCopyFunc) g_base_info_ref,
(GBoxedFreeFunc) g_base_info_unref);
return our_type;
}
/* info creation */
GIBaseInfo *
_g_info_new_full (GIInfoType type,
GIRepository *repository,
GIBaseInfo *container,
GITypelib *typelib,
guint32 offset)
{
GIRealInfo *info;
g_return_val_if_fail (container != NULL || repository != NULL, NULL);
info = g_slice_new (GIRealInfo);
_g_info_init (info, type, repository, container, typelib, offset);
info->ref_count = 1;
if (container && ((GIRealInfo *) container)->ref_count != INVALID_REFCOUNT)
g_base_info_ref (info->container);
g_object_ref (info->repository);
return (GIBaseInfo*)info;
}
/**
* g_info_new:
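* @type: the type of info to create
* @container: the info that contains the new info; must not be %NULL,
* because the repository is taken from it
* @typelib: the typelib containing the data for the new info
* @offset: offset of the info's blob within @typelib, in bytes
*
* Create a new #GIBaseInfo of the given @type, backed by the data at
* @offset in @typelib.
*
* Returns: (transfer full): the new #GIBaseInfo; free it with
* g_base_info_unref() when done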
*/
GIBaseInfo *
g_info_new (GIInfoType type,
GIBaseInfo *container,
GITypelib *typelib,
guint32 offset)
{
return _g_info_new_full (type, ((GIRealInfo*)container)->repository, container, typelib, offset);
}
void
_g_info_init (GIRealInfo *info,
GIInfoType type,
GIRepository *repository,
GIBaseInfo *container,
GITypelib *typelib,
guint32 offset)
{
memset (info, 0, sizeof (GIRealInfo));
/* Invalid refcount used to flag stack-allocated infos */
info->ref_count = INVALID_REFCOUNT;
info->type = type;
info->typelib = typelib;
info->offset = offset;
if (container)
info->container = container;
g_assert (G_IS_IREPOSITORY (repository));
info->repository = repository;
}
GIBaseInfo *
_g_info_from_entry (GIRepository *repository,
GITypelib *typelib,
guint16 index)
{
GIBaseInfo *result;
DirEntry *entry = g_typelib_get_dir_entry (typelib, index);
if (entry->local)
result = _g_info_new_full (entry->blob_type, repository, NULL, typelib, entry->offset);
else
{
const gchar *namespace = g_typelib_get_string (typelib, entry->offset);
const gchar *name = g_typelib_get_string (typelib, entry->name);
result = g_irepository_find_by_name (repository, namespace, name);
if (result == NULL)
{
GIUnresolvedInfo *unresolved;
unresolved = g_slice_new0 (GIUnresolvedInfo);
unresolved->type = GI_INFO_TYPE_UNRESOLVED;
unresolved->ref_count = 1;
unresolved->repository = g_object_ref (repository);
unresolved->container = NULL;
unresolved->name = name;
unresolved->namespace = namespace;
return (GIBaseInfo *)unresolved;
}
return (GIBaseInfo *)result;
}
return (GIBaseInfo *)result;
}
GITypeInfo *
_g_type_info_new (GIBaseInfo *container,
GITypelib *typelib,
guint32 offset)
{
SimpleTypeBlob *type = (SimpleTypeBlob *)&typelib->data[offset];
return (GITypeInfo *) g_info_new (GI_INFO_TYPE_TYPE, container, typelib,
(type->flags.reserved == 0 && type->flags.reserved2 == 0) ? offset : type->offset);
}
void
_g_type_info_init (GIBaseInfo *info,
GIBaseInfo *container,
GITypelib *typelib,
guint32 offset)
{
GIRealInfo *rinfo = (GIRealInfo*)container;
SimpleTypeBlob *type = (SimpleTypeBlob *)&typelib->data[offset];
_g_info_init ((GIRealInfo*)info, GI_INFO_TYPE_TYPE, rinfo->repository, container, typelib,
(type->flags.reserved == 0 && type->flags.reserved2 == 0) ? offset : type->offset);
}
/* GIBaseInfo functions */
/**
* SECTION:gibaseinfo
* @title: GIBaseInfo
* @short_description: Base struct for all GITypelib structs
*
* GIBaseInfo is the common base struct of all other Info structs
* accessible through the #GIRepository API.
*
* All info structures can be cast to a #GIBaseInfo, for instance:
*
* |[<!-- language="C" -->
* GIFunctionInfo *function_info = ...;
* GIBaseInfo *info = (GIBaseInfo *) function_info;
* ]|
*
* Most #GIRepository APIs that return a #GIBaseInfo actually create a
* new struct; in other words, g_base_info_unref() has to be called
* when done accessing the data.
*
* #GIBaseInfo structures are normally accessed by calling either
* g_irepository_find_by_name(), g_irepository_find_by_gtype() or
* g_irepository_get_info().
*
* |[<!-- language="C" -->
* GIBaseInfo *button_info =
* g_irepository_find_by_name (NULL, "Gtk", "Button");
*
* // ... use button_info ...
*
* g_base_info_unref (button_info);
* ]|
*
* ## Hierarchy
*
* |[<!-- language="plain" -->
* GIBaseInfo
* +---- GIArgInfo
* +---- GICallableInfo
* +---- GIConstantInfo
* +---- GIFieldInfo
* +---- GIPropertyInfo
* +---- GIRegisteredTypeInfo
* +---- GITypeInfo
* ]|
*/
/**
* g_base_info_ref: (skip)
* @info: a #GIBaseInfo
*
* Increases the reference count of @info.
*
* Returns: the same @info.
*/
GIBaseInfo *
g_base_info_ref (GIBaseInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*)info;
g_assert (rinfo->ref_count != INVALID_REFCOUNT);
g_atomic_int_inc (&rinfo->ref_count);
return info;
}
/**
* g_base_info_unref: (skip)
* @info: a #GIBaseInfo
*
* Decreases the reference count of @info. When its reference count
* drops to 0, the info is freed.
*/
void
g_base_info_unref (GIBaseInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*)info;
g_assert (rinfo->ref_count > 0 && rinfo->ref_count != INVALID_REFCOUNT);
if (!g_atomic_int_dec_and_test (&rinfo->ref_count))
return;
if (rinfo->container && ((GIRealInfo *) rinfo->container)->ref_count != INVALID_REFCOUNT)
g_base_info_unref (rinfo->container);
if (rinfo->repository)
g_object_unref (rinfo->repository);
if (rinfo->type == GI_INFO_TYPE_UNRESOLVED)
g_slice_free (GIUnresolvedInfo, (GIUnresolvedInfo *) rinfo);
else
g_slice_free (GIRealInfo, rinfo);
}
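/* Illustrative sketch, not part of libgirepository: the sentinel reference
 * count that distinguishes stack-initialised infos from heap-allocated ones
 * can be modelled in isolation. SketchInfo and the sketch_info_*() functions
 * are hypothetical, and the counter here is a plain int rather than the
 * atomic operations used above, so this model is not thread-safe.
 */

```c
#include <stdlib.h>

#define SKETCH_INVALID_REFCOUNT 0x7FFFFFFF

typedef struct SketchInfo SketchInfo;
struct SketchInfo {
  int ref_count;
  SketchInfo *container;
};

/* Stack-style initialisation: the sentinel marks the info as not
 * reference counted, so nobody may ref, unref or free it. */
static void
sketch_info_init (SketchInfo *info, SketchInfo *container)
{
  info->ref_count = SKETCH_INVALID_REFCOUNT;
  info->container = container;
}

/* Heap allocation: the count starts at 1, and the container is pinned
 * only if it is itself reference counted. */
static SketchInfo *
sketch_info_new (SketchInfo *container)
{
  SketchInfo *info = malloc (sizeof *info);

  if (info == NULL)
    return NULL;
  sketch_info_init (info, container);
  info->ref_count = 1;
  if (container != NULL && container->ref_count != SKETCH_INVALID_REFCOUNT)
    container->ref_count++;
  return info;
}

/* Drops one reference; returns 1 if @info was freed, 0 otherwise. */
static int
sketch_info_unref (SketchInfo *info)
{
  if (--info->ref_count > 0)
    return 0;
  if (info->container != NULL &&
      info->container->ref_count != SKETCH_INVALID_REFCOUNT)
    sketch_info_unref (info->container);
  free (info);
  return 1;
}
```

/* The same check appears on both the allocation and the release path: a
 * container that lives on the stack is never counted, so its lifetime has
 * to be guaranteed by the caller instead.
 */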
/**
* g_base_info_get_type:
* @info: a #GIBaseInfo
*
* Obtain the info type of the GIBaseInfo.
*
* Returns: the info type of @info
*/
GIInfoType
g_base_info_get_type (GIBaseInfo *info)
{
return ((GIRealInfo*)info)->type;
}
/**
* g_base_info_get_name:
* @info: a #GIBaseInfo
*
* Obtain the name of the @info. What the name represents depends on
* the #GIInfoType of the @info. For instance for #GIFunctionInfo it is
* the name of the function.
*
* Returns: the name of @info or %NULL if it lacks a name.
*/
const gchar *
g_base_info_get_name (GIBaseInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*)info;
g_assert (rinfo->ref_count > 0);
switch (rinfo->type)
{
case GI_INFO_TYPE_FUNCTION:
case GI_INFO_TYPE_CALLBACK:
case GI_INFO_TYPE_STRUCT:
case GI_INFO_TYPE_BOXED:
case GI_INFO_TYPE_ENUM:
case GI_INFO_TYPE_FLAGS:
case GI_INFO_TYPE_OBJECT:
case GI_INFO_TYPE_INTERFACE:
case GI_INFO_TYPE_CONSTANT:
case GI_INFO_TYPE_INVALID_0:
case GI_INFO_TYPE_UNION:
{
CommonBlob *blob = (CommonBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->name);
}
break;
case GI_INFO_TYPE_VALUE:
{
ValueBlob *blob = (ValueBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->name);
}
break;
case GI_INFO_TYPE_SIGNAL:
{
SignalBlob *blob = (SignalBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->name);
}
break;
case GI_INFO_TYPE_PROPERTY:
{
PropertyBlob *blob = (PropertyBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->name);
}
break;
case GI_INFO_TYPE_VFUNC:
{
VFuncBlob *blob = (VFuncBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->name);
}
break;
case GI_INFO_TYPE_FIELD:
{
FieldBlob *blob = (FieldBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->name);
}
break;
case GI_INFO_TYPE_ARG:
{
ArgBlob *blob = (ArgBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->name);
}
break;
case GI_INFO_TYPE_UNRESOLVED:
{
GIUnresolvedInfo *unresolved = (GIUnresolvedInfo *)info;
return unresolved->name;
}
break;
case GI_INFO_TYPE_TYPE:
return NULL;
default: ;
g_assert_not_reached ();
/* unnamed */
}
return NULL;
}
/**
* g_base_info_get_namespace:
* @info: a #GIBaseInfo
*
* Obtain the namespace of @info.
*
* Returns: the namespace
*/
const gchar *
g_base_info_get_namespace (GIBaseInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*) info;
Header *header = (Header *)rinfo->typelib->data;
g_assert (rinfo->ref_count > 0);
if (rinfo->type == GI_INFO_TYPE_UNRESOLVED)
{
GIUnresolvedInfo *unresolved = (GIUnresolvedInfo *)info;
return unresolved->namespace;
}
return g_typelib_get_string (rinfo->typelib, header->namespace);
}
/**
* g_base_info_is_deprecated:
* @info: a #GIBaseInfo
*
* Obtain whether @info represents metadata which is
* deprecated.
*
* Returns: %TRUE if deprecated
*/
gboolean
g_base_info_is_deprecated (GIBaseInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*) info;
switch (rinfo->type)
{
case GI_INFO_TYPE_FUNCTION:
case GI_INFO_TYPE_CALLBACK:
case GI_INFO_TYPE_STRUCT:
case GI_INFO_TYPE_BOXED:
case GI_INFO_TYPE_ENUM:
case GI_INFO_TYPE_FLAGS:
case GI_INFO_TYPE_OBJECT:
case GI_INFO_TYPE_INTERFACE:
case GI_INFO_TYPE_CONSTANT:
case GI_INFO_TYPE_INVALID_0:
{
CommonBlob *blob = (CommonBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->deprecated;
}
break;
case GI_INFO_TYPE_VALUE:
{
ValueBlob *blob = (ValueBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->deprecated;
}
break;
case GI_INFO_TYPE_SIGNAL:
{
SignalBlob *blob = (SignalBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->deprecated;
}
break;
case GI_INFO_TYPE_PROPERTY:
{
PropertyBlob *blob = (PropertyBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->deprecated;
}
break;
case GI_INFO_TYPE_VFUNC:
case GI_INFO_TYPE_FIELD:
case GI_INFO_TYPE_ARG:
case GI_INFO_TYPE_TYPE:
default: ;
/* no deprecation flag for these */
}
return FALSE;
}
/**
* g_base_info_get_attribute:
* @info: a #GIBaseInfo
* @name: a freeform string naming an attribute
*
* Retrieve an arbitrary attribute associated with this node.
*
* Returns: The value of the attribute, or %NULL if no such attribute exists
*/
const gchar *
g_base_info_get_attribute (GIBaseInfo *info,
const gchar *name)
{
GIAttributeIter iter = { 0, };
gchar *curname, *curvalue;
while (g_base_info_iterate_attributes (info, &iter, &curname, &curvalue))
{
if (strcmp (name, curname) == 0)
return (const gchar*) curvalue;
}
return NULL;
}
static int
cmp_attribute (const void *av,
const void *bv)
{
const AttributeBlob *a = av;
const AttributeBlob *b = bv;
if (a->offset < b->offset)
return -1;
else if (a->offset == b->offset)
return 0;
else
return 1;
}
/*
* _attribute_blob_find_first:
* @info: A #GIBaseInfo.
* @blob_offset: The offset for the blob to find the first attribute for.
*
* Searches for the first #AttributeBlob for @blob_offset and returns
* it if found.
*
* Returns: A pointer to #AttributeBlob or %NULL if not found.
*/
AttributeBlob *
_attribute_blob_find_first (GIBaseInfo *info,
guint32 blob_offset)
{
GIRealInfo *rinfo = (GIRealInfo *) info;
Header *header = (Header *)rinfo->typelib->data;
AttributeBlob blob, *first, *res, *previous;
blob.offset = blob_offset;
first = (AttributeBlob *) &rinfo->typelib->data[header->attributes];
res = bsearch (&blob, first, header->n_attributes,
header->attribute_blob_size, cmp_attribute);
if (res == NULL)
return NULL;
previous = res - 1;
while (previous >= first && previous->offset == blob_offset)
{
res = previous;
previous = res - 1;
}
return res;
}
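/* Illustrative sketch, not part of libgirepository: finding the first of
 * several entries with equal keys in a sorted table, using bsearch() and then
 * rewinding. SketchAttribute and sketch_find_first() are hypothetical; the
 * real AttributeBlob layout is defined in gitypelib-internal.h.
 */

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical attribute record, sorted by @offset in the table. */
typedef struct {
  uint32_t offset;  /* key: offset of the blob the attribute belongs to */
  uint32_t name;
  uint32_t value;
} SketchAttribute;

static int
sketch_cmp_attribute (const void *av, const void *bv)
{
  const SketchAttribute *a = av;
  const SketchAttribute *b = bv;

  return (a->offset > b->offset) - (a->offset < b->offset);
}

/* Returns the first entry whose key equals @offset, or NULL if none.
 * @table must be non-NULL and sorted by offset. */
static const SketchAttribute *
sketch_find_first (const SketchAttribute *table, size_t n, uint32_t offset)
{
  SketchAttribute key = { offset, 0, 0 };
  const SketchAttribute *res;

  res = bsearch (&key, table, n, sizeof *table, sketch_cmp_attribute);
  if (res == NULL)
    return NULL;
  /* bsearch() may land on any of the matching entries; step back while
   * the previous entry still matches. */
  while (res > table && res[-1].offset == offset)
    res--;
  return res;
}
```

/* Testing `res > table` before looking at `res[-1]` keeps the sketch from
 * ever forming a pointer before the start of the table.
 */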
/**
* g_base_info_iterate_attributes:
* @info: a #GIBaseInfo
* @iterator: (inout): a #GIAttributeIter structure, must be initialized; see below
* @name: (out) (transfer none): Returned name, must not be freed
* @value: (out) (transfer none): Returned value, must not be freed
*
* Iterate over all attributes associated with this node. The iterator
* structure is typically stack allocated, and must have its first
* member initialized to %NULL. Attributes are arbitrary namespaced key-value
* pairs which can be attached to almost any item. They are intended for use
* by software higher in the toolchain than bindings, and are distinct from
* normal GIR annotations.
*
* Both the @name and @value should be treated as constants
* and must not be freed.
*
* |[<!-- language="C" -->
* void
* print_attributes (GIBaseInfo *info)
* {
* GIAttributeIter iter = { 0, };
* char *name;
* char *value;
* while (g_base_info_iterate_attributes (info, &iter, &name, &value))
* {
* g_print ("attribute name: %s value: %s\n", name, value);
* }
* }
* ]|
*
* Returns: %TRUE if there are more attributes
*/
gboolean
g_base_info_iterate_attributes (GIBaseInfo *info,
GIAttributeIter *iterator,
gchar **name,
gchar **value)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header = (Header *)rinfo->typelib->data;
AttributeBlob *next, *after;
after = (AttributeBlob *) &rinfo->typelib->data[header->attributes +
header->n_attributes * header->attribute_blob_size];
if (iterator->data != NULL)
next = (AttributeBlob *) iterator->data;
else
next = _attribute_blob_find_first (info, rinfo->offset);
/* Check the bounds before dereferencing next */
if (next == NULL || next >= after || next->offset != rinfo->offset)
return FALSE;
*name = (gchar*) g_typelib_get_string (rinfo->typelib, next->name);
*value = (gchar*) g_typelib_get_string (rinfo->typelib, next->value);
iterator->data = next + 1;
return TRUE;
}
/**
* g_base_info_get_container:
* @info: a #GIBaseInfo
*
* Obtain the container of the @info. The container is the parent
* GIBaseInfo. For instance, the parent of a #GIFunctionInfo is an
* #GIObjectInfo or #GIInterfaceInfo.
*
* Returns: (transfer none): the container
*/
GIBaseInfo *
g_base_info_get_container (GIBaseInfo *info)
{
return ((GIRealInfo*)info)->container;
}
/**
* g_base_info_get_typelib:
* @info: a #GIBaseInfo
*
* Obtain the typelib this @info belongs to.
*
* Returns: (transfer none): the typelib.
*/
GITypelib *
g_base_info_get_typelib (GIBaseInfo *info)
{
return ((GIRealInfo*)info)->typelib;
}
/**
* g_base_info_equal:
* @info1: a #GIBaseInfo
* @info2: a #GIBaseInfo
*
* Compare two #GIBaseInfo.
*
* Using pointer comparison is not practical since many functions return
* different instances of #GIBaseInfo that refer to the same part of the
* TypeLib; use this function instead to do #GIBaseInfo comparisons.
*
* Returns: %TRUE if and only if @info1 equals @info2.
*/
gboolean
g_base_info_equal (GIBaseInfo *info1, GIBaseInfo *info2)
{
/* Compare the TypeLib pointers, which are mmapped. */
GIRealInfo *rinfo1 = (GIRealInfo*)info1;
GIRealInfo *rinfo2 = (GIRealInfo*)info2;
return rinfo1->typelib->data + rinfo1->offset == rinfo2->typelib->data + rinfo2->offset;
}

101
girepository/gibaseinfo.h Normal file

@ -0,0 +1,101 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: GIBaseInfo
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#pragma once
#if !defined (__GIREPOSITORY_H_INSIDE__) && !defined (GI_COMPILATION)
#error "Only <girepository.h> can be included directly."
#endif
#include <glib-object.h>
#include <girepository/gitypelib.h>
#include <girepository/gitypes.h>
G_BEGIN_DECLS
/**
* GIAttributeIter:
*
* An opaque structure used to iterate over attributes
* in a #GIBaseInfo struct.
*/
typedef struct {
/*< private >*/
gpointer data;
gpointer data2;
gpointer data3;
gpointer data4;
} GIAttributeIter;
#define GI_TYPE_BASE_INFO (g_base_info_gtype_get_type ())
GI_AVAILABLE_IN_ALL
GType g_base_info_gtype_get_type (void) G_GNUC_CONST;
GI_AVAILABLE_IN_ALL
GIBaseInfo * g_base_info_ref (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
void g_base_info_unref (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
GIInfoType g_base_info_get_type (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
const gchar * g_base_info_get_name (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
const gchar * g_base_info_get_namespace (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_base_info_is_deprecated (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
const gchar * g_base_info_get_attribute (GIBaseInfo *info,
const gchar *name);
GI_AVAILABLE_IN_ALL
gboolean g_base_info_iterate_attributes (GIBaseInfo *info,
GIAttributeIter *iterator,
char **name,
char **value);
GI_AVAILABLE_IN_ALL
GIBaseInfo * g_base_info_get_container (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
GITypelib * g_base_info_get_typelib (GIBaseInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_base_info_equal (GIBaseInfo *info1,
GIBaseInfo *info2);
GI_AVAILABLE_IN_ALL
GIBaseInfo * g_info_new (GIInfoType type,
GIBaseInfo *container,
GITypelib *typelib,
guint32 offset);
G_END_DECLS


@ -0,0 +1,793 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Callable implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <stdlib.h>
#include <glib.h>
#include <girepository/girepository.h>
#include "girepository-private.h"
#include "gitypelib-internal.h"
#include "girffi.h"
#include "gicallableinfo.h"
/* GICallableInfo functions */
/**
* SECTION:gicallableinfo
* @title: GICallableInfo
* @short_description: Struct representing a callable
*
* GICallableInfo represents an entity which is callable.
*
* Examples of callable are:
*
* - functions (#GIFunctionInfo)
* - virtual functions (#GIVFuncInfo)
* - callbacks (#GICallbackInfo).
*
* A callable has a list of arguments (#GIArgInfo), a return type,
* a direction and a flag which indicates whether it may return %NULL.
*/
static guint32
signature_offset (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*)info;
int sigoff = -1;
switch (rinfo->type)
{
case GI_INFO_TYPE_FUNCTION:
sigoff = G_STRUCT_OFFSET (FunctionBlob, signature);
break;
case GI_INFO_TYPE_VFUNC:
sigoff = G_STRUCT_OFFSET (VFuncBlob, signature);
break;
case GI_INFO_TYPE_CALLBACK:
sigoff = G_STRUCT_OFFSET (CallbackBlob, signature);
break;
case GI_INFO_TYPE_SIGNAL:
sigoff = G_STRUCT_OFFSET (SignalBlob, signature);
break;
default:
g_assert_not_reached ();
}
if (sigoff >= 0)
return *(guint32 *)&rinfo->typelib->data[rinfo->offset + sigoff];
return 0;
}
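/* Illustrative sketch, not part of libgirepository: reading a 32-bit field at
 * a known struct offset out of a raw byte buffer. SketchFunctionBlob is a
 * hypothetical layout; the real FunctionBlob has different fields. The sketch
 * uses memcpy() rather than a pointer cast so that it does not depend on the
 * buffer being suitably aligned.
 */

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical blob layout, for illustration only. */
typedef struct {
  uint16_t blob_type;
  uint16_t flags;
  uint32_t name;
  uint32_t signature;  /* offset of the signature blob within the buffer */
} SketchFunctionBlob;

/* Returns the signature offset stored in the blob found at @blob_offset
 * in @data. The caller must ensure the blob lies within the buffer. */
static uint32_t
sketch_signature_offset (const uint8_t *data, uint32_t blob_offset)
{
  uint32_t sig;

  memcpy (&sig,
          data + blob_offset + offsetof (SketchFunctionBlob, signature),
          sizeof sig);
  return sig;
}
```

/* offsetof() plays the role that G_STRUCT_OFFSET() plays above: the position
 * of the field is computed from the struct definition instead of being
 * hard-coded.
 */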
/**
* g_callable_info_can_throw_gerror:
* @info: a #GICallableInfo
*
* Determines whether the callable can throw a #GError.
*
* Since: 1.34
* Returns: %TRUE if this #GICallableInfo can throw a #GError
*/
gboolean
g_callable_info_can_throw_gerror (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*)info;
SignatureBlob *signature;
signature = (SignatureBlob *)&rinfo->typelib->data[signature_offset (info)];
if (signature->throws)
return TRUE;
/* Functions and VFuncs store "throws" in their own blobs.
* This info was additionally added to the SignatureBlob
* to support the other callables. For Functions and VFuncs,
* also check their legacy flag for compatibility.
*/
switch (rinfo->type) {
case GI_INFO_TYPE_FUNCTION:
{
FunctionBlob *blob;
blob = (FunctionBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->throws;
}
case GI_INFO_TYPE_VFUNC:
{
VFuncBlob *blob;
blob = (VFuncBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->throws;
}
case GI_INFO_TYPE_CALLBACK:
case GI_INFO_TYPE_SIGNAL:
return FALSE;
default:
g_assert_not_reached ();
}
}
/**
* g_callable_info_is_method:
* @info: a #GICallableInfo
*
* Determines if the callable info is a method. For #GIVFuncInfo<!-- -->s
* and #GISignalInfo<!-- -->s, this is always true; for
* #GICallbackInfo<!-- -->s, it is always false. Otherwise, this looks at
* the %GI_FUNCTION_IS_METHOD flag on the #GIFunctionInfo.
*
* Concretely, this function returns whether the raw C callable takes one
* more argument than g_callable_info_get_n_args() reports. For methods,
* that extra C argument is the "self" or "this" object, which is not
* exposed by introspection as an argument.
*
* Returns: %TRUE if @info is a method, %FALSE otherwise
* Since: 1.34
*/
gboolean
g_callable_info_is_method (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*)info;
switch (rinfo->type) {
case GI_INFO_TYPE_FUNCTION:
{
FunctionBlob *blob;
blob = (FunctionBlob *)&rinfo->typelib->data[rinfo->offset];
return (!blob->constructor && !blob->is_static);
}
case GI_INFO_TYPE_VFUNC:
case GI_INFO_TYPE_SIGNAL:
return TRUE;
case GI_INFO_TYPE_CALLBACK:
return FALSE;
default:
g_assert_not_reached ();
}
}
/**
* g_callable_info_get_return_type:
* @info: a #GICallableInfo
*
* Obtain the return type of a callable item as a #GITypeInfo.
*
* Returns: (transfer full): the #GITypeInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GITypeInfo *
g_callable_info_get_return_type (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
guint32 offset;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_CALLABLE_INFO (info), NULL);
offset = signature_offset (info);
return _g_type_info_new ((GIBaseInfo*)info, rinfo->typelib, offset);
}
/**
* g_callable_info_load_return_type:
* @info: a #GICallableInfo
* @type: (out caller-allocates): Initialized with return type of @info
*
* Obtain information about the return value of a callable; this
* function is a variant of g_callable_info_get_return_type() designed for stack
* allocation.
*
* The initialized @type must not be referenced after @info is deallocated.
*/
void
g_callable_info_load_return_type (GICallableInfo *info,
GITypeInfo *type)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
guint32 offset;
g_return_if_fail (info != NULL);
g_return_if_fail (GI_IS_CALLABLE_INFO (info));
offset = signature_offset (info);
_g_type_info_init (type, (GIBaseInfo*)info, rinfo->typelib, offset);
}
/**
* g_callable_info_may_return_null:
* @info: a #GICallableInfo
*
* See if a callable could return %NULL.
*
* Returns: %TRUE if callable could return %NULL
*/
gboolean
g_callable_info_may_return_null (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
SignatureBlob *blob;
g_return_val_if_fail (info != NULL, FALSE);
g_return_val_if_fail (GI_IS_CALLABLE_INFO (info), FALSE);
blob = (SignatureBlob *)&rinfo->typelib->data[signature_offset (info)];
return blob->may_return_null;
}
/**
* g_callable_info_skip_return:
* @info: a #GICallableInfo
*
* See if a callable's return value is only useful in C.
*
* Returns: %TRUE if return value is only useful in C.
*/
gboolean
g_callable_info_skip_return (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
SignatureBlob *blob;
g_return_val_if_fail (info != NULL, FALSE);
g_return_val_if_fail (GI_IS_CALLABLE_INFO (info), FALSE);
blob = (SignatureBlob *)&rinfo->typelib->data[signature_offset (info)];
return blob->skip_return;
}
/**
* g_callable_info_get_caller_owns:
* @info: a #GICallableInfo
*
* See whether the caller owns the return value of this callable.
* #GITransfer contains a list of possible transfer values.
*
* Returns: the transfer mode for the return value of the callable
*/
GITransfer
g_callable_info_get_caller_owns (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*) info;
SignatureBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_CALLABLE_INFO (info), -1);
blob = (SignatureBlob *)&rinfo->typelib->data[signature_offset (info)];
if (blob->caller_owns_return_value)
return GI_TRANSFER_EVERYTHING;
else if (blob->caller_owns_return_container)
return GI_TRANSFER_CONTAINER;
else
return GI_TRANSFER_NOTHING;
}
/**
* g_callable_info_get_instance_ownership_transfer:
* @info: a #GICallableInfo
*
* Obtains the ownership transfer for the instance argument.
* #GITransfer contains a list of possible transfer values.
*
* Since: 1.42
* Returns: the transfer mode of the instance argument
*/
GITransfer
g_callable_info_get_instance_ownership_transfer (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo*) info;
SignatureBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_CALLABLE_INFO (info), -1);
blob = (SignatureBlob *)&rinfo->typelib->data[signature_offset (info)];
if (blob->instance_transfer_ownership)
return GI_TRANSFER_EVERYTHING;
else
return GI_TRANSFER_NOTHING;
}
/**
* g_callable_info_get_n_args:
* @info: a #GICallableInfo
*
* Obtain the number of arguments (both IN and OUT) for this callable.
*
* Returns: The number of arguments this callable expects.
*/
gint
g_callable_info_get_n_args (GICallableInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
gint offset;
SignatureBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_CALLABLE_INFO (info), -1);
offset = signature_offset (info);
blob = (SignatureBlob *)&rinfo->typelib->data[offset];
return blob->n_arguments;
}
/**
* g_callable_info_get_arg:
* @info: a #GICallableInfo
* @n: the argument index to fetch
*
* Obtain information about a particular argument of this callable.
*
* Returns: (transfer full): the #GIArgInfo. Free it with
* g_base_info_unref() when done.
*/
GIArgInfo *
g_callable_info_get_arg (GICallableInfo *info,
gint n)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
gint offset;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_CALLABLE_INFO (info), NULL);
offset = signature_offset (info);
header = (Header *)rinfo->typelib->data;
return (GIArgInfo *) g_info_new (GI_INFO_TYPE_ARG, (GIBaseInfo*)info, rinfo->typelib,
offset + header->signature_blob_size + n * header->arg_blob_size);
}
/**
* g_callable_info_load_arg:
* @info: a #GICallableInfo
* @n: the argument index to fetch
* @arg: (out caller-allocates): Initialize with argument number @n
*
* Obtain information about a particular argument of this callable; this
* function is a variant of g_callable_info_get_arg() designed for stack
* allocation.
*
* The initialized @arg must not be referenced after @info is deallocated.
*/
void
g_callable_info_load_arg (GICallableInfo *info,
gint n,
GIArgInfo *arg)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
gint offset;
g_return_if_fail (info != NULL);
g_return_if_fail (GI_IS_CALLABLE_INFO (info));
offset = signature_offset (info);
header = (Header *)rinfo->typelib->data;
_g_info_init ((GIRealInfo*)arg, GI_INFO_TYPE_ARG, rinfo->repository, (GIBaseInfo*)info, rinfo->typelib,
offset + header->signature_blob_size + n * header->arg_blob_size);
}
/**
* g_callable_info_get_return_attribute:
* @info: a #GICallableInfo
* @name: a freeform string naming an attribute
*
* Retrieve an arbitrary attribute associated with the return value.
*
* Returns: The value of the attribute, or %NULL if no such attribute exists
*/
const gchar *
g_callable_info_get_return_attribute (GICallableInfo *info,
const gchar *name)
{
GIAttributeIter iter = { 0, };
gchar *curname, *curvalue;
while (g_callable_info_iterate_return_attributes (info, &iter, &curname, &curvalue))
{
if (g_strcmp0 (name, curname) == 0)
return (const gchar*) curvalue;
}
return NULL;
}
/**
* g_callable_info_iterate_return_attributes:
* @info: a #GICallableInfo
* @iterator: (inout): a #GIAttributeIter structure, must be initialized; see below
* @name: (out) (transfer none): Returned name, must not be freed
* @value: (out) (transfer none): Returned value, must not be freed
*
* Iterate over all attributes associated with the return value. The
* iterator structure is typically stack allocated, and must have its
* first member initialized to %NULL.
*
* Both the @name and @value should be treated as constants
* and must not be freed.
*
* See g_base_info_iterate_attributes() for an example of how to use a
* similar API.
*
* Returns: %TRUE if there are more attributes
*/
gboolean
g_callable_info_iterate_return_attributes (GICallableInfo *info,
GIAttributeIter *iterator,
char **name,
char **value)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header = (Header *)rinfo->typelib->data;
AttributeBlob *next, *after;
guint32 blob_offset;
after = (AttributeBlob *) &rinfo->typelib->data[header->attributes +
header->n_attributes * header->attribute_blob_size];
blob_offset = signature_offset (info);
if (iterator->data != NULL)
next = (AttributeBlob *) iterator->data;
else
next = _attribute_blob_find_first (info, blob_offset);
if (next == NULL || next >= after || next->offset != blob_offset)
return FALSE;
*name = (gchar*) g_typelib_get_string (rinfo->typelib, next->name);
*value = (gchar*) g_typelib_get_string (rinfo->typelib, next->value);
iterator->data = next + 1;
return TRUE;
}
/**
* gi_type_tag_extract_ffi_return_value:
* @return_tag: #GITypeTag of the return value
* @interface_type: #GIInfoType of the underlying interface type
* @ffi_value: pointer to #GIFFIReturnValue union containing the return value
* from `ffi_call()`
* @arg: (out caller-allocates): pointer to an allocated #GIArgument
*
* Extract the correct bits from an `ffi_arg` return value into
* #GIArgument.
*
* See: https://bugzilla.gnome.org/show_bug.cgi?id=665152
*
* Also see `ffi_call(3)`: the storage requirements for return values
* are "special".
*
* The @interface_type argument only applies if @return_tag is
* %GI_TYPE_TAG_INTERFACE. Otherwise it is ignored.
*
* Since: 1.72
*/
void
gi_type_tag_extract_ffi_return_value (GITypeTag return_tag,
GIInfoType interface_type,
GIFFIReturnValue *ffi_value,
GIArgument *arg)
{
switch (return_tag) {
case GI_TYPE_TAG_INT8:
arg->v_int8 = (gint8) ffi_value->v_long;
break;
case GI_TYPE_TAG_UINT8:
arg->v_uint8 = (guint8) ffi_value->v_ulong;
break;
case GI_TYPE_TAG_INT16:
arg->v_int16 = (gint16) ffi_value->v_long;
break;
case GI_TYPE_TAG_UINT16:
arg->v_uint16 = (guint16) ffi_value->v_ulong;
break;
case GI_TYPE_TAG_INT32:
arg->v_int32 = (gint32) ffi_value->v_long;
break;
case GI_TYPE_TAG_UINT32:
case GI_TYPE_TAG_BOOLEAN:
case GI_TYPE_TAG_UNICHAR:
arg->v_uint32 = (guint32) ffi_value->v_ulong;
break;
case GI_TYPE_TAG_INT64:
arg->v_int64 = (gint64) ffi_value->v_int64;
break;
case GI_TYPE_TAG_UINT64:
arg->v_uint64 = (guint64) ffi_value->v_uint64;
break;
case GI_TYPE_TAG_FLOAT:
arg->v_float = ffi_value->v_float;
break;
case GI_TYPE_TAG_DOUBLE:
arg->v_double = ffi_value->v_double;
break;
case GI_TYPE_TAG_INTERFACE:
switch(interface_type) {
case GI_INFO_TYPE_ENUM:
case GI_INFO_TYPE_FLAGS:
arg->v_int32 = (gint32) ffi_value->v_long;
break;
default:
arg->v_pointer = (gpointer) ffi_value->v_pointer;
break;
}
break;
default:
arg->v_pointer = (gpointer) ffi_value->v_pointer;
break;
}
}
/**
* gi_type_info_extract_ffi_return_value:
* @return_info: #GITypeInfo describing the return type
* @ffi_value: pointer to #GIFFIReturnValue union containing the return value
* from `ffi_call()`
* @arg: (out caller-allocates): pointer to an allocated #GIArgument
*
* Extract the correct bits from an `ffi_arg` return value into
* #GIArgument.
*
* See: https://bugzilla.gnome.org/show_bug.cgi?id=665152
*
* Also see `ffi_call(3)`: the storage requirements for return values
* are "special".
*/
void
gi_type_info_extract_ffi_return_value (GITypeInfo *return_info,
GIFFIReturnValue *ffi_value,
GIArgument *arg)
{
GITypeTag return_tag = g_type_info_get_tag (return_info);
GIInfoType interface_type = GI_INFO_TYPE_INVALID;
if (return_tag == GI_TYPE_TAG_INTERFACE)
{
GIBaseInfo *interface_info = g_type_info_get_interface (return_info);
interface_type = g_base_info_get_type (interface_info);
g_base_info_unref (interface_info);
}
gi_type_tag_extract_ffi_return_value (return_tag, interface_type,
ffi_value, arg);
}
/**
* g_callable_info_invoke:
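* @info: a #GICallableInfo describing the callable to invoke
* @function: address of the native function to call
* @in_args: (array length=n_in_args): one element for each "in" and "inout"
* argument, preceded by the instance if @is_method is %TRUE
* @n_in_args: the number of elements in @in_args
* @out_args: (array length=n_out_args): one element for each "out" and
* "inout" argument
* @n_out_args: the number of elements in @out_args
* @return_value: (out caller-allocates): location to store the return value;
* must not be %NULL
* @is_method: %TRUE if @function is a method, in which case the first element
* of @in_args is passed as the instance argument
* @throws: %TRUE if @function takes a trailing #GError ** argument
* @error: return location for a #GError, or %NULL
*
* Invoke @function through libffi, using @info to describe its arguments
* and return type.
*
* Returns: %TRUE if @function was called and did not set an error,
* %FALSE otherwise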
*/
gboolean
g_callable_info_invoke (GICallableInfo *info,
gpointer function,
const GIArgument *in_args,
int n_in_args,
const GIArgument *out_args,
int n_out_args,
GIArgument *return_value,
gboolean is_method,
gboolean throws,
GError **error)
{
ffi_cif cif;
ffi_type *rtype;
ffi_type **atypes;
GITypeInfo *tinfo;
GITypeInfo *rinfo;
GITypeTag rtag;
GIArgInfo *ainfo;
gint n_args, n_invoke_args, in_pos, out_pos, i;
gpointer *args;
gboolean success = FALSE;
GError *local_error = NULL;
gpointer error_address = &local_error;
GIFFIReturnValue ffi_return_value;
gpointer return_value_p; /* Will point inside the union return_value */
rinfo = g_callable_info_get_return_type ((GICallableInfo *)info);
rtype = g_type_info_get_ffi_type (rinfo);
rtag = g_type_info_get_tag(rinfo);
in_pos = 0;
out_pos = 0;
n_args = g_callable_info_get_n_args ((GICallableInfo *)info);
if (is_method)
{
if (n_in_args == 0)
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_ARGUMENT_MISMATCH,
"Too few \"in\" arguments (handling this)");
goto out;
}
n_invoke_args = n_args+1;
in_pos++;
}
else
n_invoke_args = n_args;
if (throws)
/* Add an argument for the GError */
n_invoke_args ++;
atypes = g_alloca (sizeof (ffi_type*) * n_invoke_args);
args = g_alloca (sizeof (gpointer) * n_invoke_args);
if (is_method)
{
atypes[0] = &ffi_type_pointer;
args[0] = (gpointer) &in_args[0];
}
for (i = 0; i < n_args; i++)
{
int offset = (is_method ? 1 : 0);
ainfo = g_callable_info_get_arg ((GICallableInfo *)info, i);
switch (g_arg_info_get_direction (ainfo))
{
case GI_DIRECTION_IN:
tinfo = g_arg_info_get_type (ainfo);
atypes[i+offset] = g_type_info_get_ffi_type (tinfo);
g_base_info_unref ((GIBaseInfo *)ainfo);
g_base_info_unref ((GIBaseInfo *)tinfo);
if (in_pos >= n_in_args)
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_ARGUMENT_MISMATCH,
"Too few \"in\" arguments (handling in)");
goto out;
}
args[i+offset] = (gpointer)&in_args[in_pos];
in_pos++;
break;
case GI_DIRECTION_OUT:
atypes[i+offset] = &ffi_type_pointer;
g_base_info_unref ((GIBaseInfo *)ainfo);
if (out_pos >= n_out_args)
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_ARGUMENT_MISMATCH,
"Too few \"out\" arguments (handling out)");
goto out;
}
args[i+offset] = (gpointer)&out_args[out_pos];
out_pos++;
break;
case GI_DIRECTION_INOUT:
atypes[i+offset] = &ffi_type_pointer;
g_base_info_unref ((GIBaseInfo *)ainfo);
if (in_pos >= n_in_args)
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_ARGUMENT_MISMATCH,
"Too few \"in\" arguments (handling inout)");
goto out;
}
if (out_pos >= n_out_args)
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_ARGUMENT_MISMATCH,
"Too few \"out\" arguments (handling inout)");
goto out;
}
args[i+offset] = (gpointer)&in_args[in_pos];
in_pos++;
out_pos++;
break;
default:
g_base_info_unref ((GIBaseInfo *)ainfo);
g_assert_not_reached ();
}
}
if (throws)
{
args[n_invoke_args - 1] = &error_address;
atypes[n_invoke_args - 1] = &ffi_type_pointer;
}
if (in_pos < n_in_args)
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_ARGUMENT_MISMATCH,
"Too many \"in\" arguments (at end)");
goto out;
}
if (out_pos < n_out_args)
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_ARGUMENT_MISMATCH,
"Too many \"out\" arguments (at end)");
goto out;
}
if (ffi_prep_cif (&cif, FFI_DEFAULT_ABI, n_invoke_args, rtype, atypes) != FFI_OK)
goto out;
g_return_val_if_fail (return_value, FALSE);
/* See comment for GIFFIReturnValue above */
switch (rtag)
{
case GI_TYPE_TAG_FLOAT:
return_value_p = &ffi_return_value.v_float;
break;
case GI_TYPE_TAG_DOUBLE:
return_value_p = &ffi_return_value.v_double;
break;
case GI_TYPE_TAG_INT64:
case GI_TYPE_TAG_UINT64:
return_value_p = &ffi_return_value.v_uint64;
break;
default:
return_value_p = &ffi_return_value.v_long;
}
ffi_call (&cif, function, return_value_p, args);
if (local_error)
{
g_propagate_error (error, local_error);
success = FALSE;
}
else
{
gi_type_info_extract_ffi_return_value (rinfo, &ffi_return_value, return_value);
success = TRUE;
}
out:
g_base_info_unref ((GIBaseInfo *)rinfo);
return success;
}

@ -0,0 +1,107 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Callable
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#pragma once
#if !defined (__GIREPOSITORY_H_INSIDE__) && !defined (GI_COMPILATION)
#error "Only <girepository.h> can be included directly."
#endif
#include <girepository/gitypes.h>
G_BEGIN_DECLS
/**
* GI_IS_CALLABLE_INFO
* @info: an info structure
*
* Checks if @info is a #GICallableInfo or derived from it.
*/
#define GI_IS_CALLABLE_INFO(info) \
((g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_FUNCTION) || \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_CALLBACK) || \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_SIGNAL) || \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_VFUNC))
GI_AVAILABLE_IN_ALL
gboolean g_callable_info_is_method (GICallableInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_callable_info_can_throw_gerror (GICallableInfo *info);
GI_AVAILABLE_IN_ALL
GITypeInfo * g_callable_info_get_return_type (GICallableInfo *info);
GI_AVAILABLE_IN_ALL
void g_callable_info_load_return_type (GICallableInfo *info,
GITypeInfo *type);
GI_AVAILABLE_IN_ALL
const gchar * g_callable_info_get_return_attribute (GICallableInfo *info,
const gchar *name);
GI_AVAILABLE_IN_ALL
gboolean g_callable_info_iterate_return_attributes (GICallableInfo *info,
GIAttributeIter *iterator,
char **name,
char **value);
GI_AVAILABLE_IN_ALL
GITransfer g_callable_info_get_caller_owns (GICallableInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_callable_info_may_return_null (GICallableInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_callable_info_skip_return (GICallableInfo *info);
GI_AVAILABLE_IN_ALL
gint g_callable_info_get_n_args (GICallableInfo *info);
GI_AVAILABLE_IN_ALL
GIArgInfo * g_callable_info_get_arg (GICallableInfo *info,
gint n);
GI_AVAILABLE_IN_ALL
void g_callable_info_load_arg (GICallableInfo *info,
gint n,
GIArgInfo *arg);
GI_AVAILABLE_IN_ALL
gboolean g_callable_info_invoke (GICallableInfo *info,
gpointer function,
const GIArgument *in_args,
int n_in_args,
const GIArgument *out_args,
int n_out_args,
GIArgument *return_value,
gboolean is_method,
gboolean throws,
GError **error);
GI_AVAILABLE_IN_ALL
GITransfer g_callable_info_get_instance_ownership_transfer (GICallableInfo *info);
G_END_DECLS
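
The return-attribute iterator declared above keeps its position in the first member of the iterator struct, which must start out as NULL. As a rough, self-contained model of that pattern (not the typelib format itself), the sketch below walks a small table of attributes that is assumed to be grouped by the offset of the blob each attribute annotates. `ToyAttribute`, `ToyAttributeIter`, `toy_attributes` and the `toy_*` functions are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute record: which blob it annotates, plus key/value. */
typedef struct {
  unsigned    offset;
  const char *name;
  const char *value;
} ToyAttribute;

/* Hypothetical iterator: a NULL data pointer means "not started yet". */
typedef struct {
  const ToyAttribute *data;
} ToyAttributeIter;

/* Made-up table, grouped by owner offset. */
static const ToyAttribute toy_attributes[] = {
  { 16, "since",        "1.0"  },
  { 32, "element-type", "utf8" },
  { 32, "nullable",     "1"    },
  { 48, "scope",        "call" },
};

#define TOY_N_ATTRIBUTES (sizeof toy_attributes / sizeof toy_attributes[0])

/* Find the first attribute that belongs to owner_offset, or NULL. */
static const ToyAttribute *
toy_find_first (unsigned owner_offset)
{
  size_t i;

  for (i = 0; i < TOY_N_ATTRIBUTES; i++)
    if (toy_attributes[i].offset == owner_offset)
      return &toy_attributes[i];
  return NULL;
}

/* Return 1 and fill name/value while attributes remain, else return 0. */
static int
toy_iterate_attributes (unsigned          owner_offset,
                        ToyAttributeIter *iter,
                        const char      **name,
                        const char      **value)
{
  const ToyAttribute *after = toy_attributes + TOY_N_ATTRIBUTES;
  const ToyAttribute *next;

  next = (iter->data != NULL) ? iter->data : toy_find_first (owner_offset);

  /* Check the bounds before looking at the record itself. */
  if (next == NULL || next >= after || next->offset != owner_offset)
    return 0;

  *name = next->name;
  *value = next->value;
  iter->data = next + 1;
  return 1;
}

/* Look up one attribute by name, in the style of a get_attribute() helper. */
static const char *
toy_get_attribute (unsigned owner_offset, const char *wanted)
{
  ToyAttributeIter iter = { NULL };
  const char *name, *value;

  while (toy_iterate_attributes (owner_offset, &iter, &name, &value))
    if (strcmp (name, wanted) == 0)
      return value;
  return NULL;
}
```

With this table, `toy_get_attribute (32, "nullable")` walks the two records owned by offset 32 and stops as soon as the owner changes or the table ends.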

@ -0,0 +1,182 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Constant implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <glib.h>
#include <string.h> // memcpy
#include <girepository/girepository.h>
#include "girepository-private.h"
#include "gitypelib-internal.h"
#include "giconstantinfo.h"
/**
* SECTION:giconstantinfo
* @title: GIConstantInfo
* @short_description: Struct representing a constant
*
* GIConstantInfo represents a constant.
*
* A constant has an associated type, which can be obtained by calling
* g_constant_info_get_type(), and a value, which can be obtained by
* calling g_constant_info_get_value().
*/
/**
* g_constant_info_get_type:
* @info: a #GIConstantInfo
*
* Obtain the type of the constant as a #GITypeInfo.
*
* Returns: (transfer full): the #GITypeInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GITypeInfo *
g_constant_info_get_type (GIConstantInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_CONSTANT_INFO (info), NULL);
return _g_type_info_new ((GIBaseInfo*)info, rinfo->typelib, rinfo->offset + 8);
}
#define DO_ALIGNED_COPY(dest_addr, src_addr, type) \
memcpy((dest_addr), (src_addr), sizeof(type))
/**
* g_constant_info_free_value: (skip)
* @info: a #GIConstantInfo
* @value: the argument
*
* Free the value returned from g_constant_info_get_value().
*
* Since: 1.32
*/
void
g_constant_info_free_value (GIConstantInfo *info,
GIArgument *value)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ConstantBlob *blob;
g_return_if_fail (info != NULL);
g_return_if_fail (GI_IS_CONSTANT_INFO (info));
blob = (ConstantBlob *)&rinfo->typelib->data[rinfo->offset];
/* FIXME non-basic types ? */
if (blob->type.flags.reserved == 0 && blob->type.flags.reserved2 == 0)
{
if (blob->type.flags.pointer)
g_free (value->v_pointer);
}
}
/**
* g_constant_info_get_value: (skip)
* @info: a #GIConstantInfo
* @value: (out): an argument
*
* Obtain the value associated with the #GIConstantInfo and store it in the
* @value parameter. @value needs to be allocated before passing it in.
* The size of the constant value stored in @value will be returned.
* Free the value with g_constant_info_free_value().
*
* Returns: size of the constant
*/
gint
g_constant_info_get_value (GIConstantInfo *info,
GIArgument *value)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ConstantBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_CONSTANT_INFO (info), 0);
blob = (ConstantBlob *)&rinfo->typelib->data[rinfo->offset];
/* FIXME non-basic types ? */
if (blob->type.flags.reserved == 0 && blob->type.flags.reserved2 == 0)
{
if (blob->type.flags.pointer)
{
#if GLIB_CHECK_VERSION (2, 67, 5)
gsize blob_size = blob->size;
value->v_pointer = g_memdup2 (&rinfo->typelib->data[blob->offset], blob_size);
#else
value->v_pointer = g_memdup (&rinfo->typelib->data[blob->offset], blob->size);
#endif
}
else
{
switch (blob->type.flags.tag)
{
case GI_TYPE_TAG_BOOLEAN:
value->v_boolean = *(gboolean*)&rinfo->typelib->data[blob->offset];
break;
case GI_TYPE_TAG_INT8:
value->v_int8 = *(gint8*)&rinfo->typelib->data[blob->offset];
break;
case GI_TYPE_TAG_UINT8:
value->v_uint8 = *(guint8*)&rinfo->typelib->data[blob->offset];
break;
case GI_TYPE_TAG_INT16:
value->v_int16 = *(gint16*)&rinfo->typelib->data[blob->offset];
break;
case GI_TYPE_TAG_UINT16:
value->v_uint16 = *(guint16*)&rinfo->typelib->data[blob->offset];
break;
case GI_TYPE_TAG_INT32:
value->v_int32 = *(gint32*)&rinfo->typelib->data[blob->offset];
break;
case GI_TYPE_TAG_UINT32:
value->v_uint32 = *(guint32*)&rinfo->typelib->data[blob->offset];
break;
case GI_TYPE_TAG_INT64:
DO_ALIGNED_COPY(&value->v_int64, &rinfo->typelib->data[blob->offset], gint64);
break;
case GI_TYPE_TAG_UINT64:
DO_ALIGNED_COPY(&value->v_uint64, &rinfo->typelib->data[blob->offset], guint64);
break;
case GI_TYPE_TAG_FLOAT:
DO_ALIGNED_COPY(&value->v_float, &rinfo->typelib->data[blob->offset], gfloat);
break;
case GI_TYPE_TAG_DOUBLE:
DO_ALIGNED_COPY(&value->v_double, &rinfo->typelib->data[blob->offset], gdouble);
break;
default:
g_assert_not_reached ();
}
}
}
return blob->size;
}
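
The DO_ALIGNED_COPY macro above reads 64-bit and floating-point constants with memcpy() rather than by casting and dereferencing a pointer into the typelib buffer, since the data at a given offset is not guaranteed to be suitably aligned. A minimal sketch of the same idea in plain C follows; `toy_read_int64()` and `toy_read_double()` are hypothetical helpers, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 64-bit integer from any byte offset without assuming alignment. */
static int64_t
toy_read_int64 (const unsigned char *data, size_t offset)
{
  int64_t v;

  assert (data != NULL);
  memcpy (&v, data + offset, sizeof v);
  return v;
}

/* Same technique for a double. */
static double
toy_read_double (const unsigned char *data, size_t offset)
{
  double v;

  assert (data != NULL);
  memcpy (&v, data + offset, sizeof v);
  return v;
}
```

Compilers typically turn a fixed-size memcpy() like this into a plain load on platforms that allow unaligned access, so the portable form rarely costs anything.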

@ -0,0 +1,55 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Constant
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#pragma once
#if !defined (__GIREPOSITORY_H_INSIDE__) && !defined (GI_COMPILATION)
#error "Only <girepository.h> can be included directly."
#endif
#include <girepository/gitypes.h>
G_BEGIN_DECLS
/**
* GI_IS_CONSTANT_INFO
* @info: an info structure
*
* Checks if @info is a #GIConstantInfo.
*/
#define GI_IS_CONSTANT_INFO(info) \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_CONSTANT)
GI_AVAILABLE_IN_ALL
GITypeInfo * g_constant_info_get_type (GIConstantInfo *info);
GI_AVAILABLE_IN_ALL
void g_constant_info_free_value(GIConstantInfo *info,
GIArgument *value);
GI_AVAILABLE_IN_ALL
gint g_constant_info_get_value(GIConstantInfo *info,
GIArgument *value);
G_END_DECLS
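
g_constant_info_get_value() and g_constant_info_free_value(), declared above, form a pair: pointer-typed constants are duplicated out of the typelib and must be released again, while basic values are stored inline and need no cleanup. The sketch below models that contract with plain malloc() and free(). `ToyConstant`, `ToyValue` and the `toy_constant_*` functions are hypothetical names, and the inline case is simplified to a single 32-bit integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of a constant stored in a byte buffer. */
typedef struct {
  int    is_pointer; /* non-zero: value is returned as a heap copy */
  size_t offset;     /* where the bytes start */
  size_t size;       /* how many bytes the constant occupies */
} ToyConstant;

/* Hypothetical stand-in for GIArgument. */
typedef union {
  int32_t v_int32;
  void   *v_pointer;
} ToyValue;

/* Store the constant in value and return its size in bytes. */
static size_t
toy_constant_get_value (const unsigned char *data,
                        const ToyConstant   *c,
                        ToyValue            *value)
{
  assert (data != NULL && c != NULL && value != NULL);

  if (c->is_pointer)
    {
      /* Pointer-typed constants are copied, so the caller owns the result. */
      value->v_pointer = malloc (c->size);
      if (value->v_pointer != NULL)
        memcpy (value->v_pointer, data + c->offset, c->size);
    }
  else
    {
      /* Simplification: the only inline type modelled here is int32. */
      assert (c->size == sizeof value->v_int32);
      memcpy (&value->v_int32, data + c->offset, sizeof value->v_int32);
    }

  return c->size;
}

/* Release whatever toy_constant_get_value() allocated, if anything. */
static void
toy_constant_free_value (const ToyConstant *c, ToyValue *value)
{
  assert (c != NULL && value != NULL);

  if (c->is_pointer)
    free (value->v_pointer);
}
```

Callers pair every `toy_constant_get_value()` with a `toy_constant_free_value()` on the same constant, regardless of its type, so they never need to know whether a copy was made.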

235
girepository/gienuminfo.c Normal file
@ -0,0 +1,235 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Enum implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <glib.h>
#include <girepository/girepository.h>
#include "girepository-private.h"
#include "gitypelib-internal.h"
#include "gienuminfo.h"
/**
* SECTION:gienuminfo
* @title: GIEnumInfo
* @short_description: Structs representing an enumeration and its values
*
* A GIEnumInfo represents an enumeration, and a GIValueInfo represents
* a value in the enumeration.
*
* The GIEnumInfo contains a set of values and a type.
*
* The GIValueInfo is fetched by calling g_enum_info_get_value() on
* a GIEnumInfo.
*/
/**
* g_enum_info_get_n_values:
* @info: a #GIEnumInfo
*
* Obtain the number of values this enumeration contains.
*
* Returns: the number of enumeration values
*/
gint
g_enum_info_get_n_values (GIEnumInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
EnumBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_ENUM_INFO (info), 0);
blob = (EnumBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_values;
}
/**
* g_enum_info_get_error_domain:
* @info: a #GIEnumInfo
*
* Obtain the string form of the quark for the error domain associated with
* this enum, if any.
*
* Returns: (transfer none): the string form of the error domain associated
* with this enum, or %NULL.
* Since: 1.30
*/
const gchar *
g_enum_info_get_error_domain (GIEnumInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
EnumBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_ENUM_INFO (info), NULL);
blob = (EnumBlob *)&rinfo->typelib->data[rinfo->offset];
if (blob->error_domain)
return g_typelib_get_string (rinfo->typelib, blob->error_domain);
else
return NULL;
}
/**
* g_enum_info_get_value:
* @info: a #GIEnumInfo
* @n: index of value to fetch
*
* Obtain a value for this enumeration.
*
* Returns: (transfer full): the enumeration value, or %NULL if the type tag
* is wrong. Free the struct with g_base_info_unref() when done.
*/
GIValueInfo *
g_enum_info_get_value (GIEnumInfo *info,
gint n)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
gint offset;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_ENUM_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
offset = rinfo->offset + header->enum_blob_size
+ n * header->value_blob_size;
return (GIValueInfo *) g_info_new (GI_INFO_TYPE_VALUE, (GIBaseInfo*)info, rinfo->typelib, offset);
}
/**
* g_enum_info_get_n_methods:
* @info: a #GIEnumInfo
*
* Obtain the number of methods that this enum type has.
*
* Returns: number of methods
* Since: 1.30
*/
gint
g_enum_info_get_n_methods (GIEnumInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
EnumBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_ENUM_INFO (info), 0);
blob = (EnumBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_methods;
}
/**
* g_enum_info_get_method:
* @info: a #GIEnumInfo
* @n: index of method to get
*
* Obtain an enum type method at index @n.
*
* Returns: (transfer full): the #GIFunctionInfo. Free the struct by calling
* g_base_info_unref() when done.
* Since: 1.30
*/
GIFunctionInfo *
g_enum_info_get_method (GIEnumInfo *info,
gint n)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
EnumBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_ENUM_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
blob = (EnumBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->enum_blob_size
+ blob->n_values * header->value_blob_size
+ n * header->function_blob_size;
return (GIFunctionInfo *) g_info_new (GI_INFO_TYPE_FUNCTION, (GIBaseInfo*)info,
rinfo->typelib, offset);
}
/**
* g_enum_info_get_storage_type:
* @info: a #GIEnumInfo
*
* Obtain the tag of the type used for the enum in the C ABI. This will
* be a signed or unsigned integral type.
*
* Note that in the current implementation the width of the type is
* computed correctly, but the signed or unsigned nature of the type
* may not match the sign of the type used by the C compiler.
*
* Returns: the storage type for the enumeration
*/
GITypeTag
g_enum_info_get_storage_type (GIEnumInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
EnumBlob *blob;
g_return_val_if_fail (info != NULL, GI_TYPE_TAG_BOOLEAN);
g_return_val_if_fail (GI_IS_ENUM_INFO (info), GI_TYPE_TAG_BOOLEAN);
blob = (EnumBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->storage_type;
}
/**
* g_value_info_get_value:
* @info: a #GIValueInfo
*
* Obtain the enumeration value of the #GIValueInfo.
*
* Returns: the enumeration value. This will always be representable
* as a 32-bit signed or unsigned value. The use of gint64 as the
* return type is to allow both.
*/
gint64
g_value_info_get_value (GIValueInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
ValueBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_VALUE_INFO (info), -1);
blob = (ValueBlob *)&rinfo->typelib->data[rinfo->offset];
if (blob->unsigned_value)
return (gint64)(guint32)blob->value;
else
return (gint64)blob->value;
}
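
g_value_info_get_value() above returns a 64-bit integer so that both signed and unsigned 32-bit enumeration values fit. The widening it performs can be sketched in isolation as follows, assuming the stored value is a 32-bit signed field accompanied by an "unsigned" flag; `toy_value_to_int64()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a stored 32-bit value to 64 bits, honouring the unsigned flag. */
static int64_t
toy_value_to_int64 (int32_t stored, int is_unsigned)
{
  if (is_unsigned)
    /* Reinterpret the bits as unsigned first, so no sign extension occurs. */
    return (int64_t) (uint32_t) stored;

  /* Plain widening keeps negative values negative. */
  return (int64_t) stored;
}
```

The same bit pattern therefore yields two different results: all bits set is -1 for a signed enumeration and 4294967295 for an unsigned one.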

79
girepository/gienuminfo.h Normal file
@ -0,0 +1,79 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Enum and Enum values
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#pragma once
#if !defined (__GIREPOSITORY_H_INSIDE__) && !defined (GI_COMPILATION)
#error "Only <girepository.h> can be included directly."
#endif
#include <girepository/gitypes.h>
G_BEGIN_DECLS
/**
* GI_IS_ENUM_INFO
* @info: an info structure
*
* Checks if @info is a #GIEnumInfo.
*/
#define GI_IS_ENUM_INFO(info) \
((g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_ENUM) || \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_FLAGS))
/**
* GI_IS_VALUE_INFO
* @info: an info structure
*
* Checks if @info is a #GIValueInfo.
*/
#define GI_IS_VALUE_INFO(info) \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_VALUE)
GI_AVAILABLE_IN_ALL
gint g_enum_info_get_n_values (GIEnumInfo *info);
GI_AVAILABLE_IN_ALL
GIValueInfo * g_enum_info_get_value (GIEnumInfo *info,
gint n);
GI_AVAILABLE_IN_ALL
gint g_enum_info_get_n_methods (GIEnumInfo *info);
GI_AVAILABLE_IN_ALL
GIFunctionInfo * g_enum_info_get_method (GIEnumInfo *info,
gint n);
GI_AVAILABLE_IN_ALL
GITypeTag g_enum_info_get_storage_type (GIEnumInfo *info);
GI_AVAILABLE_IN_ALL
const gchar * g_enum_info_get_error_domain (GIEnumInfo *info);
GI_AVAILABLE_IN_ALL
gint64 g_value_info_get_value (GIValueInfo *info);
G_END_DECLS
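
g_enum_info_get_value() and g_enum_info_get_method(), declared above, locate their results purely by offset arithmetic: the enum blob is followed by its value blobs, which are followed by its method blobs. A small sketch of that layout calculation follows. `ToyHeader` and the `toy_enum_*_offset()` functions are hypothetical, and the blob sizes used with them are made-up numbers, not the real typelib sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header carrying the fixed size of each blob kind. */
typedef struct {
  size_t enum_blob_size;
  size_t value_blob_size;
  size_t function_blob_size;
} ToyHeader;

/* Offset of the n-th value blob of the enum starting at enum_offset. */
static size_t
toy_enum_value_offset (const ToyHeader *header, size_t enum_offset, size_t n)
{
  assert (header != NULL);
  return enum_offset + header->enum_blob_size + n * header->value_blob_size;
}

/* Offset of the n-th method blob, which sits after all n_values values. */
static size_t
toy_enum_method_offset (const ToyHeader *header,
                        size_t           enum_offset,
                        size_t           n_values,
                        size_t           n)
{
  assert (header != NULL);
  return enum_offset + header->enum_blob_size
         + n_values * header->value_blob_size
         + n * header->function_blob_size;
}
```

Neither function checks `n` against the number of values or methods, which mirrors the sources above: callers are expected to stay below g_enum_info_get_n_values() and g_enum_info_get_n_methods().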

556
girepository/gifieldinfo.c Normal file
@ -0,0 +1,556 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Field implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <glib.h>
#include <girepository/girepository.h>
#include "girepository-private.h"
#include "gitypelib-internal.h"
#include "gifieldinfo.h"
/**
* SECTION:gifieldinfo
* @title: GIFieldInfo
* @short_description: Struct representing a struct or union field
*
* A GIFieldInfo struct represents a field of a struct, union, or object.
*
* The GIFieldInfo is fetched by calling g_struct_info_get_field(),
* g_union_info_get_field() or g_object_info_get_field().
*
* A field has an associated size, type and struct offset, and a set of flags,
* which are currently #GI_FIELD_IS_READABLE or #GI_FIELD_IS_WRITABLE.
*
* See also: #GIStructInfo, #GIUnionInfo, #GIObjectInfo
*/
/**
* g_field_info_get_flags:
* @info: a #GIFieldInfo
*
* Obtain the flags for this #GIFieldInfo. See #GIFieldInfoFlags for possible
* flag values.
*
* Returns: the flags
*/
GIFieldInfoFlags
g_field_info_get_flags (GIFieldInfo *info)
{
GIFieldInfoFlags flags;
GIRealInfo *rinfo = (GIRealInfo *)info;
FieldBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_FIELD_INFO (info), 0);
blob = (FieldBlob *)&rinfo->typelib->data[rinfo->offset];
flags = 0;
if (blob->readable)
flags = flags | GI_FIELD_IS_READABLE;
if (blob->writable)
flags = flags | GI_FIELD_IS_WRITABLE;
return flags;
}
/**
* g_field_info_get_size:
* @info: a #GIFieldInfo
*
* Obtain the size of the field member, in bits. This is how
* much space you need to allocate to store the field.
*
* Returns: the field size
*/
gint
g_field_info_get_size (GIFieldInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
FieldBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_FIELD_INFO (info), 0);
blob = (FieldBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->bits;
}
/**
* g_field_info_get_offset:
* @info: a #GIFieldInfo
*
* Obtain the offset of the field member, in bytes. This is relative
* to the beginning of the struct or union.
*
* Returns: the field offset
*/
gint
g_field_info_get_offset (GIFieldInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
FieldBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_FIELD_INFO (info), 0);
blob = (FieldBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->struct_offset;
}
/**
* g_field_info_get_type:
* @info: a #GIFieldInfo
*
* Obtain the type of a field as a #GITypeInfo.
*
* Returns: (transfer full): the #GITypeInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GITypeInfo *
g_field_info_get_type (GIFieldInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header = (Header *)rinfo->typelib->data;
FieldBlob *blob;
GIRealInfo *type_info;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_FIELD_INFO (info), NULL);
blob = (FieldBlob *)&rinfo->typelib->data[rinfo->offset];
if (blob->has_embedded_type)
{
type_info = (GIRealInfo *) g_info_new (GI_INFO_TYPE_TYPE,
(GIBaseInfo*)info, rinfo->typelib,
rinfo->offset + header->field_blob_size);
type_info->type_is_embedded = TRUE;
}
else
return _g_type_info_new ((GIBaseInfo*)info, rinfo->typelib, rinfo->offset + G_STRUCT_OFFSET (FieldBlob, type));
return (GIBaseInfo*)type_info;
}
/**
* g_field_info_get_field: (skip)
* @field_info: a #GIFieldInfo
* @mem: pointer to a block of memory representing a C structure or union
* @value: a #GIArgument into which to store the value retrieved
*
* Reads a field identified by a #GIFieldInfo from a C structure or
* union. This only handles fields of simple C types. It will fail
* for a field of a composite type like a nested structure or union
* even if that is actually readable.
*
* Returns: %TRUE if reading the field succeeded, otherwise %FALSE
*/
gboolean
g_field_info_get_field (GIFieldInfo *field_info,
gpointer mem,
GIArgument *value)
{
int offset;
GITypeInfo *type_info;
gboolean result = FALSE;
g_return_val_if_fail (field_info != NULL, FALSE);
g_return_val_if_fail (GI_IS_FIELD_INFO (field_info), FALSE);
if ((g_field_info_get_flags (field_info) & GI_FIELD_IS_READABLE) == 0)
return FALSE;
offset = g_field_info_get_offset (field_info);
type_info = g_field_info_get_type (field_info);
if (g_type_info_is_pointer (type_info))
{
value->v_pointer = G_STRUCT_MEMBER (gpointer, mem, offset);
result = TRUE;
}
else
{
switch (g_type_info_get_tag (type_info))
{
case GI_TYPE_TAG_VOID:
g_warning("Field %s: should not have void type",
g_base_info_get_name ((GIBaseInfo *)field_info));
break;
case GI_TYPE_TAG_BOOLEAN:
value->v_boolean = G_STRUCT_MEMBER (gboolean, mem, offset) != FALSE;
result = TRUE;
break;
case GI_TYPE_TAG_INT8:
case GI_TYPE_TAG_UINT8:
value->v_uint8 = G_STRUCT_MEMBER (guint8, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_INT16:
case GI_TYPE_TAG_UINT16:
value->v_uint16 = G_STRUCT_MEMBER (guint16, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_INT32:
case GI_TYPE_TAG_UINT32:
case GI_TYPE_TAG_UNICHAR:
value->v_uint32 = G_STRUCT_MEMBER (guint32, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_INT64:
case GI_TYPE_TAG_UINT64:
value->v_uint64 = G_STRUCT_MEMBER (guint64, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_GTYPE:
value->v_size = G_STRUCT_MEMBER (gsize, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_FLOAT:
value->v_float = G_STRUCT_MEMBER (gfloat, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_DOUBLE:
value->v_double = G_STRUCT_MEMBER (gdouble, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_ARRAY:
/* We don't check the array type and that it is fixed-size,
we trust g-ir-compiler to do the right thing */
value->v_pointer = G_STRUCT_MEMBER_P (mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_UTF8:
case GI_TYPE_TAG_FILENAME:
case GI_TYPE_TAG_GLIST:
case GI_TYPE_TAG_GSLIST:
case GI_TYPE_TAG_GHASH:
g_warning("Field %s: type %s should have is_pointer set",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_type_tag_to_string (g_type_info_get_tag (type_info)));
break;
case GI_TYPE_TAG_ERROR:
/* Needs to be handled by the language binding directly */
break;
case GI_TYPE_TAG_INTERFACE:
{
GIBaseInfo *interface = g_type_info_get_interface (type_info);
switch (g_base_info_get_type (interface))
{
case GI_INFO_TYPE_STRUCT:
case GI_INFO_TYPE_UNION:
case GI_INFO_TYPE_BOXED:
/* Needs to be handled by the language binding directly */
break;
case GI_INFO_TYPE_OBJECT:
break;
case GI_INFO_TYPE_ENUM:
case GI_INFO_TYPE_FLAGS:
{
/* FIXME: there's a mismatch here between the value->v_int we use
* here and the gint64 result returned from g_value_info_get_value().
* But to switch this to gint64, we'd have to make g_function_info_invoke()
* translate value->v_int64 to the proper ABI for an enum function
* call parameter, which will usually be int, and then fix up language
* bindings.
*/
GITypeTag storage_type = g_enum_info_get_storage_type ((GIEnumInfo *)interface);
switch (storage_type)
{
case GI_TYPE_TAG_INT8:
case GI_TYPE_TAG_UINT8:
value->v_int = (gint)G_STRUCT_MEMBER (guint8, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_INT16:
case GI_TYPE_TAG_UINT16:
value->v_int = (gint)G_STRUCT_MEMBER (guint16, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_INT32:
case GI_TYPE_TAG_UINT32:
value->v_int = (gint)G_STRUCT_MEMBER (guint32, mem, offset);
result = TRUE;
break;
case GI_TYPE_TAG_INT64:
case GI_TYPE_TAG_UINT64:
value->v_int = (gint)G_STRUCT_MEMBER (guint64, mem, offset);
result = TRUE;
break;
default:
g_warning("Field %s: Unexpected enum storage type %s",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_type_tag_to_string (storage_type));
break;
}
break;
}
case GI_INFO_TYPE_VFUNC:
case GI_INFO_TYPE_CALLBACK:
g_warning("Field %s: Interface type %d should have is_pointer set",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_base_info_get_type (interface));
break;
case GI_INFO_TYPE_INVALID:
case GI_INFO_TYPE_INTERFACE:
case GI_INFO_TYPE_FUNCTION:
case GI_INFO_TYPE_CONSTANT:
case GI_INFO_TYPE_INVALID_0:
case GI_INFO_TYPE_VALUE:
case GI_INFO_TYPE_SIGNAL:
case GI_INFO_TYPE_PROPERTY:
case GI_INFO_TYPE_FIELD:
case GI_INFO_TYPE_ARG:
case GI_INFO_TYPE_TYPE:
case GI_INFO_TYPE_UNRESOLVED:
g_warning("Field %s: Interface type %d not expected",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_base_info_get_type (interface));
break;
default:
break;
}
g_base_info_unref ((GIBaseInfo *)interface);
break;
}
break;
default:
break;
}
}
g_base_info_unref ((GIBaseInfo *)type_info);
return result;
}
/**
* g_field_info_set_field: (skip)
* @field_info: a #GIFieldInfo
* @mem: pointer to a block of memory representing a C structure or union
* @value: a #GIArgument holding the value to store
*
* Writes a field identified by a #GIFieldInfo to a C structure or
* union. This only handles fields of simple C types. It will fail
* for a field of a composite type like a nested structure or union
* even if that is actually writable. Note also that it will refuse
* to write fields where memory management would be required. A field
* with a type such as 'char *' must be set with a setter function.
*
* Returns: %TRUE if writing the field succeeded, otherwise %FALSE
*/
gboolean
g_field_info_set_field (GIFieldInfo *field_info,
gpointer mem,
const GIArgument *value)
{
int offset;
GITypeInfo *type_info;
gboolean result = FALSE;
g_return_val_if_fail (field_info != NULL, FALSE);
g_return_val_if_fail (GI_IS_FIELD_INFO (field_info), FALSE);
if ((g_field_info_get_flags (field_info) & GI_FIELD_IS_WRITABLE) == 0)
return FALSE;
offset = g_field_info_get_offset (field_info);
type_info = g_field_info_get_type (field_info);
if (!g_type_info_is_pointer (type_info))
{
switch (g_type_info_get_tag (type_info))
{
case GI_TYPE_TAG_VOID:
g_warning("Field %s: should not have void type",
g_base_info_get_name ((GIBaseInfo *)field_info));
break;
case GI_TYPE_TAG_BOOLEAN:
G_STRUCT_MEMBER (gboolean, mem, offset) = value->v_boolean != FALSE;
result = TRUE;
break;
case GI_TYPE_TAG_INT8:
case GI_TYPE_TAG_UINT8:
G_STRUCT_MEMBER (guint8, mem, offset) = value->v_uint8;
result = TRUE;
break;
case GI_TYPE_TAG_INT16:
case GI_TYPE_TAG_UINT16:
G_STRUCT_MEMBER (guint16, mem, offset) = value->v_uint16;
result = TRUE;
break;
case GI_TYPE_TAG_INT32:
case GI_TYPE_TAG_UINT32:
case GI_TYPE_TAG_UNICHAR:
G_STRUCT_MEMBER (guint32, mem, offset) = value->v_uint32;
result = TRUE;
break;
case GI_TYPE_TAG_INT64:
case GI_TYPE_TAG_UINT64:
G_STRUCT_MEMBER (guint64, mem, offset) = value->v_uint64;
result = TRUE;
break;
case GI_TYPE_TAG_GTYPE:
G_STRUCT_MEMBER (gsize, mem, offset) = value->v_size;
result = TRUE;
break;
case GI_TYPE_TAG_FLOAT:
G_STRUCT_MEMBER (gfloat, mem, offset) = value->v_float;
result = TRUE;
break;
case GI_TYPE_TAG_DOUBLE:
G_STRUCT_MEMBER (gdouble, mem, offset) = value->v_double;
result = TRUE;
break;
case GI_TYPE_TAG_UTF8:
case GI_TYPE_TAG_FILENAME:
case GI_TYPE_TAG_ARRAY:
case GI_TYPE_TAG_GLIST:
case GI_TYPE_TAG_GSLIST:
case GI_TYPE_TAG_GHASH:
g_warning("Field %s: type %s should have is_pointer set",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_type_tag_to_string (g_type_info_get_tag (type_info)));
break;
case GI_TYPE_TAG_ERROR:
/* Needs to be handled by the language binding directly */
break;
case GI_TYPE_TAG_INTERFACE:
{
GIBaseInfo *interface = g_type_info_get_interface (type_info);
switch (g_base_info_get_type (interface))
{
case GI_INFO_TYPE_STRUCT:
case GI_INFO_TYPE_UNION:
case GI_INFO_TYPE_BOXED:
/* Needs to be handled by the language binding directly */
break;
case GI_INFO_TYPE_OBJECT:
break;
case GI_INFO_TYPE_ENUM:
case GI_INFO_TYPE_FLAGS:
{
/* See FIXME above
*/
GITypeTag storage_type = g_enum_info_get_storage_type ((GIEnumInfo *)interface);
switch (storage_type)
{
case GI_TYPE_TAG_INT8:
case GI_TYPE_TAG_UINT8:
G_STRUCT_MEMBER (guint8, mem, offset) = (guint8)value->v_int;
result = TRUE;
break;
case GI_TYPE_TAG_INT16:
case GI_TYPE_TAG_UINT16:
G_STRUCT_MEMBER (guint16, mem, offset) = (guint16)value->v_int;
result = TRUE;
break;
case GI_TYPE_TAG_INT32:
case GI_TYPE_TAG_UINT32:
G_STRUCT_MEMBER (guint32, mem, offset) = (guint32)value->v_int;
result = TRUE;
break;
case GI_TYPE_TAG_INT64:
case GI_TYPE_TAG_UINT64:
G_STRUCT_MEMBER (guint64, mem, offset) = (guint64)value->v_int;
result = TRUE;
break;
default:
g_warning("Field %s: Unexpected enum storage type %s",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_type_tag_to_string (storage_type));
break;
}
break;
}
break;
case GI_INFO_TYPE_VFUNC:
case GI_INFO_TYPE_CALLBACK:
g_warning("Field %s: Interface type %d should have is_pointer set",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_base_info_get_type (interface));
break;
case GI_INFO_TYPE_INVALID:
case GI_INFO_TYPE_INTERFACE:
case GI_INFO_TYPE_FUNCTION:
case GI_INFO_TYPE_CONSTANT:
case GI_INFO_TYPE_INVALID_0:
case GI_INFO_TYPE_VALUE:
case GI_INFO_TYPE_SIGNAL:
case GI_INFO_TYPE_PROPERTY:
case GI_INFO_TYPE_FIELD:
case GI_INFO_TYPE_ARG:
case GI_INFO_TYPE_TYPE:
case GI_INFO_TYPE_UNRESOLVED:
g_warning("Field %s: Interface type %d not expected",
g_base_info_get_name ((GIBaseInfo *)field_info),
g_base_info_get_type (interface));
break;
default:
break;
}
g_base_info_unref ((GIBaseInfo *)interface);
break;
}
break;
default:
break;
}
} else {
switch (g_type_info_get_tag (type_info))
{
case GI_TYPE_TAG_INTERFACE:
{
GIBaseInfo *interface = g_type_info_get_interface (type_info);
switch (g_base_info_get_type (interface))
{
case GI_INFO_TYPE_OBJECT:
case GI_INFO_TYPE_INTERFACE:
G_STRUCT_MEMBER (gpointer, mem, offset) = (gpointer)value->v_pointer;
result = TRUE;
break;
default:
break;
}
g_base_info_unref ((GIBaseInfo *)interface);
}
break;
default:
break;
}
}
g_base_info_unref ((GIBaseInfo *)type_info);
return result;
}

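The field accessors above read and write a member by adding its byte offset to a base pointer, which is what G_STRUCT_MEMBER() does. The following is a minimal standalone sketch of that idea using only the C standard library. The Sample struct and the read_u32_at() and write_u32_at() helpers are hypothetical names chosen for illustration; they are not part of libgirepository.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure standing in for an introspected C struct. */
typedef struct
{
  uint8_t  tag;
  uint32_t count;
  double   ratio;
} Sample;

/* Read a 32-bit member located `offset` bytes from the start of `mem`.
 * memcpy() is used instead of a pointer cast so the sketch does not
 * depend on alignment or aliasing assumptions. */
static uint32_t
read_u32_at (const void *mem, size_t offset)
{
  uint32_t v;

  memcpy (&v, (const unsigned char *) mem + offset, sizeof v);
  return v;
}

/* Write a 32-bit member located `offset` bytes from the start of `mem`. */
static void
write_u32_at (void *mem, size_t offset, uint32_t v)
{
  memcpy ((unsigned char *) mem + offset, &v, sizeof v);
}
```

In this sketch the offset would come from offsetof (Sample, count); in libgirepository it comes from g_field_info_get_offset(), and the member width is chosen from the type tag as in the switch statements above.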
View File

@ -0,0 +1,68 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Field and Field values
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#pragma once
#if !defined (__GIREPOSITORY_H_INSIDE__) && !defined (GI_COMPILATION)
#error "Only <girepository.h> can be included directly."
#endif
#include <girepository/gitypes.h>
G_BEGIN_DECLS
/**
* GI_IS_FIELD_INFO
* @info: an info structure
*
* Checks if @info is a #GIFieldInfo.
*
*/
#define GI_IS_FIELD_INFO(info) \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_FIELD)
GI_AVAILABLE_IN_ALL
GIFieldInfoFlags g_field_info_get_flags (GIFieldInfo *info);
GI_AVAILABLE_IN_ALL
gint g_field_info_get_size (GIFieldInfo *info);
GI_AVAILABLE_IN_ALL
gint g_field_info_get_offset (GIFieldInfo *info);
GI_AVAILABLE_IN_ALL
GITypeInfo * g_field_info_get_type (GIFieldInfo *info);
GI_AVAILABLE_IN_ALL
gboolean g_field_info_get_field (GIFieldInfo *field_info,
gpointer mem,
GIArgument *value);
GI_AVAILABLE_IN_ALL
gboolean g_field_info_set_field (GIFieldInfo *field_info,
gpointer mem,
const GIArgument *value);
G_END_DECLS

View File

@ -0,0 +1,297 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Function implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <string.h>
#include <glib.h>
#include <girepository/girepository.h>
#include "girepository-private.h"
#include "gitypelib-internal.h"
#include "gifunctioninfo.h"
/**
* SECTION:gifunctioninfo
* @title: GIFunctionInfo
* @short_description: Struct representing a function
*
* GIFunctionInfo represents a function, method or constructor.
*
* To find out what kind of entity a #GIFunctionInfo represents, call
* g_function_info_get_flags().
*
* See also #GICallableInfo for information on how to retrieve arguments and
* other metadata.
*/
GIFunctionInfo *
_g_base_info_find_method (GIBaseInfo *base,
guint32 offset,
gint n_methods,
const gchar *name)
{
/* FIXME hash */
GIRealInfo *rinfo = (GIRealInfo*)base;
Header *header = (Header *)rinfo->typelib->data;
gint i;
for (i = 0; i < n_methods; i++)
{
FunctionBlob *fblob = (FunctionBlob *)&rinfo->typelib->data[offset];
const gchar *fname = (const gchar *)&rinfo->typelib->data[fblob->name];
if (strcmp (name, fname) == 0)
return (GIFunctionInfo *) g_info_new (GI_INFO_TYPE_FUNCTION, base,
rinfo->typelib, offset);
offset += header->function_blob_size;
}
return NULL;
}
/**
* g_function_info_get_symbol:
* @info: a #GIFunctionInfo
*
* Obtain the symbol of the function. The symbol is the name of the
* exported function, suitable to be used as an argument to
* g_module_symbol().
*
* Returns: the symbol
*/
const gchar *
g_function_info_get_symbol (GIFunctionInfo *info)
{
GIRealInfo *rinfo;
FunctionBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_FUNCTION_INFO (info), NULL);
rinfo = (GIRealInfo *)info;
blob = (FunctionBlob *)&rinfo->typelib->data[rinfo->offset];
return g_typelib_get_string (rinfo->typelib, blob->symbol);
}
/**
* g_function_info_get_flags:
* @info: a #GIFunctionInfo
*
* Obtain the #GIFunctionInfoFlags for the @info.
*
* Returns: the flags
*/
GIFunctionInfoFlags
g_function_info_get_flags (GIFunctionInfo *info)
{
GIFunctionInfoFlags flags;
GIRealInfo *rinfo;
FunctionBlob *blob;
g_return_val_if_fail (info != NULL, -1);
g_return_val_if_fail (GI_IS_FUNCTION_INFO (info), -1);
rinfo = (GIRealInfo *)info;
blob = (FunctionBlob *)&rinfo->typelib->data[rinfo->offset];
flags = 0;
/* Make sure we don't flag Constructors as methods */
if (!blob->constructor && !blob->is_static)
flags = flags | GI_FUNCTION_IS_METHOD;
if (blob->constructor)
flags = flags | GI_FUNCTION_IS_CONSTRUCTOR;
if (blob->getter)
flags = flags | GI_FUNCTION_IS_GETTER;
if (blob->setter)
flags = flags | GI_FUNCTION_IS_SETTER;
if (blob->wraps_vfunc)
flags = flags | GI_FUNCTION_WRAPS_VFUNC;
if (blob->throws)
flags = flags | GI_FUNCTION_THROWS;
return flags;
}
/**
* g_function_info_get_property:
* @info: a #GIFunctionInfo
*
* Obtain the property associated with this #GIFunctionInfo.
* Only #GIFunctionInfo with the flag %GI_FUNCTION_IS_GETTER or
* %GI_FUNCTION_IS_SETTER have a property set. For other cases,
* %NULL will be returned.
*
* Returns: (transfer full): the property or %NULL if not set. Free it with
* g_base_info_unref() when done.
*/
GIPropertyInfo *
g_function_info_get_property (GIFunctionInfo *info)
{
GIRealInfo *rinfo, *container_rinfo;
FunctionBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_FUNCTION_INFO (info), NULL);
rinfo = (GIRealInfo *)info;
blob = (FunctionBlob *)&rinfo->typelib->data[rinfo->offset];
container_rinfo = (GIRealInfo *)rinfo->container;
if (container_rinfo->type == GI_INFO_TYPE_INTERFACE)
{
GIInterfaceInfo *container = (GIInterfaceInfo *)rinfo->container;
return g_interface_info_get_property (container, blob->index);
}
else if (container_rinfo->type == GI_INFO_TYPE_OBJECT)
{
GIObjectInfo *container = (GIObjectInfo *)rinfo->container;
return g_object_info_get_property (container, blob->index);
}
else
return NULL;
}
/**
* g_function_info_get_vfunc:
* @info: a #GIFunctionInfo
*
* Obtain the virtual function associated with this #GIFunctionInfo.
* Only #GIFunctionInfo with the flag %GI_FUNCTION_WRAPS_VFUNC has
* a virtual function set. For other cases, %NULL will be returned.
*
* Returns: (transfer full): the virtual function or %NULL if not set.
* Free it by calling g_base_info_unref() when done.
*/
GIVFuncInfo *
g_function_info_get_vfunc (GIFunctionInfo *info)
{
GIRealInfo *rinfo;
FunctionBlob *blob;
GIInterfaceInfo *container;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_FUNCTION_INFO (info), NULL);
rinfo = (GIRealInfo *)info;
blob = (FunctionBlob *)&rinfo->typelib->data[rinfo->offset];
container = (GIInterfaceInfo *)rinfo->container;
return g_interface_info_get_vfunc (container, blob->index);
}
/**
* g_invoke_error_quark:
*
* Get the error quark which represents #GInvokeError.
*
* Returns: the error quark
*/
GQuark
g_invoke_error_quark (void)
{
static GQuark quark = 0;
if (quark == 0)
quark = g_quark_from_static_string ("g-invoke-error-quark");
return quark;
}
/**
* g_function_info_invoke: (skip)
* @info: a #GIFunctionInfo describing the function to invoke
* @in_args: (array length=n_in_args): an array of #GIArgument<!-- -->s, one for each in
* parameter of @info. If there are no in parameters, @in_args
* can be %NULL
* @n_in_args: the length of the @in_args array
* @out_args: (array length=n_out_args): an array of #GIArgument<!-- -->s, one for each out
* parameter of @info. If there are no out parameters, @out_args
* may be %NULL
* @n_out_args: the length of the @out_args array
* @return_value: return location for the return value of the
* function.
* @error: return location for detailed error information, or %NULL
*
* Invokes the function described in @info with the given
* arguments. Note that inout parameters must appear in both
* argument lists. This function uses dlsym() to obtain a pointer
* to the function, so the library or shared object containing the
* described function must either be linked to the caller, or must
* have been g_module_symbol()<!-- -->ed before calling this function.
*
* Returns: %TRUE if the function has been invoked, %FALSE if an
* error occurred.
*/
gboolean
g_function_info_invoke (GIFunctionInfo *info,
const GIArgument *in_args,
int n_in_args,
const GIArgument *out_args,
int n_out_args,
GIArgument *return_value,
GError **error)
{
const gchar *symbol;
gpointer func;
gboolean is_method;
gboolean throws;
symbol = g_function_info_get_symbol (info);
if (!g_typelib_symbol (g_base_info_get_typelib((GIBaseInfo *) info),
symbol, &func))
{
g_set_error (error,
G_INVOKE_ERROR,
G_INVOKE_ERROR_SYMBOL_NOT_FOUND,
"Could not locate %s: %s", symbol, g_module_error ());
return FALSE;
}
is_method = (g_function_info_get_flags (info) & GI_FUNCTION_IS_METHOD) != 0
&& (g_function_info_get_flags (info) & GI_FUNCTION_IS_CONSTRUCTOR) == 0;
throws = (g_function_info_get_flags (info) & GI_FUNCTION_THROWS) != 0;
return g_callable_info_invoke ((GICallableInfo*) info,
func,
in_args,
n_in_args,
out_args,
n_out_args,
return_value,
is_method,
throws,
error);
}

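_g_base_info_find_method() above walks a run of fixed-size blobs inside the typelib buffer and compares the name each blob points to. The following is a minimal standalone sketch of that lookup pattern using only the C standard library. The Record layout and the find_record() helper are hypothetical and much simpler than the real FunctionBlob; they only illustrate the stride-and-compare loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size record; the real FunctionBlob has more members. */
typedef struct
{
  uint32_t name;   /* byte offset of a NUL-terminated name in the buffer */
  uint32_t flags;
} Record;

/* Return the byte offset of the first record whose name matches `name`,
 * or -1 if none of the `n_records` records matches. `record_size` is the
 * stride between records and is passed separately, in the same way the
 * typelib header stores function_blob_size. */
static long
find_record (const unsigned char *data,
             size_t               first_offset,
             size_t               record_size,
             int                  n_records,
             const char          *name)
{
  size_t offset = first_offset;
  int i;

  for (i = 0; i < n_records; i++)
    {
      Record rec;

      memcpy (&rec, data + offset, sizeof rec);
      if (strcmp (name, (const char *) data + rec.name) == 0)
        return (long) offset;
      offset += record_size;
    }

  return -1;
}
```

The search is linear in the number of records, which is the cost the FIXME in _g_base_info_find_method() refers to when it suggests hashing.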
View File

@ -0,0 +1,97 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Function
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#pragma once
#if !defined (__GIREPOSITORY_H_INSIDE__) && !defined (GI_COMPILATION)
#error "Only <girepository.h> can be included directly."
#endif
#include <girepository/gitypes.h>
G_BEGIN_DECLS
/**
* GI_IS_FUNCTION_INFO
* @info: an info structure
*
* Checks if @info is a #GIFunctionInfo.
*/
#define GI_IS_FUNCTION_INFO(info) \
(g_base_info_get_type((GIBaseInfo*)info) == GI_INFO_TYPE_FUNCTION)
GI_AVAILABLE_IN_ALL
const gchar * g_function_info_get_symbol (GIFunctionInfo *info);
GI_AVAILABLE_IN_ALL
GIFunctionInfoFlags g_function_info_get_flags (GIFunctionInfo *info);
GI_AVAILABLE_IN_ALL
GIPropertyInfo * g_function_info_get_property (GIFunctionInfo *info);
GI_AVAILABLE_IN_ALL
GIVFuncInfo * g_function_info_get_vfunc (GIFunctionInfo *info);
/**
* G_INVOKE_ERROR:
*
* Error domain for invocation errors. Errors in this domain will be from
* the #GInvokeError enumeration.
*/
#define G_INVOKE_ERROR (g_invoke_error_quark ())
GI_AVAILABLE_IN_ALL
GQuark g_invoke_error_quark (void);
/**
* GInvokeError:
* @G_INVOKE_ERROR_FAILED: invocation failed, unknown error.
* @G_INVOKE_ERROR_SYMBOL_NOT_FOUND: symbol couldn't be found in any of the
* libraries associated with the typelib of the function.
* @G_INVOKE_ERROR_ARGUMENT_MISMATCH: the arguments provided didn't match
* the expected arguments for the function's type signature.
*
* An error occurring while invoking a function via
* g_function_info_invoke().
*/
typedef enum
{
G_INVOKE_ERROR_FAILED,
G_INVOKE_ERROR_SYMBOL_NOT_FOUND,
G_INVOKE_ERROR_ARGUMENT_MISMATCH
} GInvokeError;
GI_AVAILABLE_IN_ALL
gboolean g_function_info_invoke (GIFunctionInfo *info,
const GIArgument *in_args,
int n_in_args,
const GIArgument *out_args,
int n_out_args,
GIArgument *return_value,
GError **error);
G_END_DECLS

View File

@ -0,0 +1,503 @@
/* -*- mode: C; c-file-style: "gnu"; indent-tabs-mode: nil; -*-
* GObject introspection: Interface implementation
*
* Copyright (C) 2005 Matthias Clasen
* Copyright (C) 2008,2009 Red Hat, Inc.
*
* SPDX-License-Identifier: LGPL-2.1-or-later
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include <glib.h>
#include <girepository/girepository.h>
#include "girepository-private.h"
#include "gitypelib-internal.h"
#include "giinterfaceinfo.h"
/**
* SECTION:giinterfaceinfo
* @title: GIInterfaceInfo
* @short_description: Struct representing a GInterface
*
* GIInterfaceInfo represents a #GInterface type.
*
* A GInterface has methods, fields, properties, signals, interfaces, constants,
* virtual functions and prerequisites.
*/
/**
* g_interface_info_get_n_prerequisites:
* @info: a #GIInterfaceInfo
*
* Obtain the number of prerequisites for this interface type.
* A prerequisite is another interface that needs to be implemented for
* an interface, similar to a base class for GObjects.
*
* Returns: number of prerequisites
*/
gint
g_interface_info_get_n_prerequisites (GIInterfaceInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), 0);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_prerequisites;
}
/**
* g_interface_info_get_prerequisite:
* @info: a #GIInterfaceInfo
* @n: index of prerequisite to get
*
* Obtain an interface type's prerequisite at index @n.
*
* Returns: (transfer full): the prerequisite as a #GIBaseInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GIBaseInfo *
g_interface_info_get_prerequisite (GIInterfaceInfo *info,
gint n)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
return _g_info_from_entry (rinfo->repository,
rinfo->typelib, blob->prerequisites[n]);
}
/**
* g_interface_info_get_n_properties:
* @info: a #GIInterfaceInfo
*
* Obtain the number of properties that this interface type has.
*
* Returns: number of properties
*/
gint
g_interface_info_get_n_properties (GIInterfaceInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), 0);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_properties;
}
/**
* g_interface_info_get_property:
* @info: a #GIInterfaceInfo
* @n: index of property to get
*
* Obtain an interface type property at index @n.
*
* Returns: (transfer full): the #GIPropertyInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GIPropertyInfo *
g_interface_info_get_property (GIInterfaceInfo *info,
gint n)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->interface_blob_size
+ (blob->n_prerequisites + (blob->n_prerequisites % 2)) * 2
+ n * header->property_blob_size;
return (GIPropertyInfo *) g_info_new (GI_INFO_TYPE_PROPERTY, (GIBaseInfo*)info,
rinfo->typelib, offset);
}
/**
* g_interface_info_get_n_methods:
* @info: a #GIInterfaceInfo
*
* Obtain the number of methods that this interface type has.
*
* Returns: number of methods
*/
gint
g_interface_info_get_n_methods (GIInterfaceInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), 0);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_methods;
}
/**
* g_interface_info_get_method:
* @info: a #GIInterfaceInfo
* @n: index of method to get
*
* Obtain an interface type method at index @n.
*
* Returns: (transfer full): the #GIFunctionInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GIFunctionInfo *
g_interface_info_get_method (GIInterfaceInfo *info,
gint n)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->interface_blob_size
+ (blob->n_prerequisites + (blob->n_prerequisites % 2)) * 2
+ blob->n_properties * header->property_blob_size
+ n * header->function_blob_size;
return (GIFunctionInfo *) g_info_new (GI_INFO_TYPE_FUNCTION, (GIBaseInfo*)info,
rinfo->typelib, offset);
}
/**
* g_interface_info_find_method:
* @info: a #GIInterfaceInfo
* @name: name of method to obtain
*
* Obtain a method of the interface type given a @name. %NULL will be
* returned if there's no method available with that name.
*
* Returns: (transfer full): the #GIFunctionInfo or %NULL if none found.
* Free the struct by calling g_base_info_unref() when done.
*/
GIFunctionInfo *
g_interface_info_find_method (GIInterfaceInfo *info,
const gchar *name)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header = (Header *)rinfo->typelib->data;
InterfaceBlob *blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->interface_blob_size
+ (blob->n_prerequisites + (blob->n_prerequisites % 2)) * 2
+ blob->n_properties * header->property_blob_size;
return _g_base_info_find_method ((GIBaseInfo*)info, offset, blob->n_methods, name);
}
/**
* g_interface_info_get_n_signals:
* @info: a #GIInterfaceInfo
*
* Obtain the number of signals that this interface type has.
*
* Returns: number of signals
*/
gint
g_interface_info_get_n_signals (GIInterfaceInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), 0);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_signals;
}
/**
* g_interface_info_get_signal:
* @info: a #GIInterfaceInfo
* @n: index of signal to get
*
* Obtain an interface type signal at index @n.
*
* Returns: (transfer full): the #GISignalInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GISignalInfo *
g_interface_info_get_signal (GIInterfaceInfo *info,
gint n)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->interface_blob_size
+ (blob->n_prerequisites + (blob->n_prerequisites % 2)) * 2
+ blob->n_properties * header->property_blob_size
+ blob->n_methods * header->function_blob_size
+ n * header->signal_blob_size;
return (GISignalInfo *) g_info_new (GI_INFO_TYPE_SIGNAL, (GIBaseInfo*)info,
rinfo->typelib, offset);
}
/**
* g_interface_info_find_signal:
* @info: a #GIInterfaceInfo
* @name: Name of signal
*
* Obtain the signal named @name from this interface type.
*
* Returns: (transfer full): Info for the signal with name @name in @info, or
* %NULL on failure.
* Since: 1.34
*/
GISignalInfo *
g_interface_info_find_signal (GIInterfaceInfo *info,
const gchar *name)
{
gint n_signals;
gint i;
n_signals = g_interface_info_get_n_signals (info);
for (i = 0; i < n_signals; i++)
{
GISignalInfo *siginfo = g_interface_info_get_signal (info, i);
if (g_strcmp0 (g_base_info_get_name ((GIBaseInfo*)siginfo), name) != 0)
{
g_base_info_unref ((GIBaseInfo*)siginfo);
continue;
}
return siginfo;
}
return NULL;
}
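/*
 * Illustration only, not part of the library: a minimal, self-contained
 * sketch of the ownership pattern used by g_interface_info_find_signal()
 * above. Each candidate is obtained as a new reference, released if its
 * name does not match, and handed to the caller (transfer full) if it does.
 * SketchInfo, sketch_info_new(), sketch_info_unref(), sketch_find_by_name()
 * and sketch_live_infos are hypothetical stand-ins, and strcmp() is used in
 * place of g_strcmp0(), so unlike the real function the sketch assumes
 * non-NULL names.
 */

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ref-counted info object. */
typedef struct {
  int ref_count;
  const char *name;
} SketchInfo;

/* Number of SketchInfo objects currently alive, to make leaks visible. */
static int sketch_live_infos = 0;

/* Create a new info with one reference, owned by the caller. */
static SketchInfo *
sketch_info_new (const char *name)
{
  SketchInfo *info = malloc (sizeof *info);

  if (info == NULL)
    abort ();

  info->ref_count = 1;
  info->name = name;
  sketch_live_infos++;
  return info;
}

/* Drop one reference and free the info when the last one is gone. */
static void
sketch_info_unref (SketchInfo *info)
{
  if (--info->ref_count == 0)
    {
      sketch_live_infos--;
      free (info);
    }
}

/* Linear search by name. Non-matching candidates are released inside the
 * loop; the matching one is returned with its reference still held, so the
 * caller must unref it. Returns NULL if nothing matches. */
static SketchInfo *
sketch_find_by_name (const char *const *names,
                     int                n_names,
                     const char        *name)
{
  for (int i = 0; i < n_names; i++)
    {
      SketchInfo *candidate = sketch_info_new (names[i]);

      if (strcmp (candidate->name, name) == 0)
        return candidate;

      sketch_info_unref (candidate);
    }

  return NULL;
}
```

/*
 * A caller of the sketch would release the returned object with
 * sketch_info_unref(), just as callers of g_interface_info_find_signal()
 * release the result with g_base_info_unref().
 */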
/**
* g_interface_info_get_n_vfuncs:
* @info: a #GIInterfaceInfo
*
* Obtain the number of virtual functions that this interface type has.
*
* Returns: number of virtual functions
*/
gint
g_interface_info_get_n_vfuncs (GIInterfaceInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), 0);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_vfuncs;
}
/**
* g_interface_info_get_vfunc:
* @info: a #GIInterfaceInfo
* @n: index of virtual function to get
*
* Obtain an interface type virtual function at index @n.
*
* Returns: (transfer full): the #GIVFuncInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GIVFuncInfo *
g_interface_info_get_vfunc (GIInterfaceInfo *info,
gint n)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->interface_blob_size
+ (blob->n_prerequisites + (blob->n_prerequisites % 2)) * 2
+ blob->n_properties * header->property_blob_size
+ blob->n_methods * header->function_blob_size
+ blob->n_signals * header->signal_blob_size
+ n * header->vfunc_blob_size;
return (GIVFuncInfo *) g_info_new (GI_INFO_TYPE_VFUNC, (GIBaseInfo*)info,
rinfo->typelib, offset);
}
/**
* g_interface_info_find_vfunc:
* @info: a #GIInterfaceInfo
* @name: The name of a virtual function to find.
*
* Locate a virtual function slot with name @name. See the documentation
* for g_object_info_find_vfunc() for more information on virtuals.
*
* Returns: (transfer full): the #GIVFuncInfo, or %NULL. Free it with
* g_base_info_unref() when done.
*/
GIVFuncInfo *
g_interface_info_find_vfunc (GIInterfaceInfo *info,
const gchar *name)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->interface_blob_size
+ (blob->n_prerequisites + (blob->n_prerequisites % 2)) * 2
+ blob->n_properties * header->property_blob_size
+ blob->n_methods * header->function_blob_size
+ blob->n_signals * header->signal_blob_size;
return _g_base_info_find_vfunc (rinfo, offset, blob->n_vfuncs, name);
}
/**
* g_interface_info_get_n_constants:
* @info: a #GIInterfaceInfo
*
* Obtain the number of constants that this interface type has.
*
* Returns: number of constants
*/
gint
g_interface_info_get_n_constants (GIInterfaceInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, 0);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), 0);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
return blob->n_constants;
}
/**
* g_interface_info_get_constant:
* @info: a #GIInterfaceInfo
* @n: index of constant to get
*
* Obtain an interface type constant at index @n.
*
* Returns: (transfer full): the #GIConstantInfo. Free the struct by calling
* g_base_info_unref() when done.
*/
GIConstantInfo *
g_interface_info_get_constant (GIInterfaceInfo *info,
gint n)
{
gint offset;
GIRealInfo *rinfo = (GIRealInfo *)info;
Header *header;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
header = (Header *)rinfo->typelib->data;
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
offset = rinfo->offset + header->interface_blob_size
+ (blob->n_prerequisites + (blob->n_prerequisites % 2)) * 2
+ blob->n_properties * header->property_blob_size
+ blob->n_methods * header->function_blob_size
+ blob->n_signals * header->signal_blob_size
+ blob->n_vfuncs * header->vfunc_blob_size
+ n * header->constant_blob_size;
return (GIConstantInfo *) g_info_new (GI_INFO_TYPE_CONSTANT, (GIBaseInfo*)info,
rinfo->typelib, offset);
}
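/*
 * Illustration only, not part of the library: a minimal, self-contained
 * sketch of the offset arithmetic repeated in the accessors above. The
 * member arrays follow the interface blob in a fixed order (prerequisites,
 * properties, methods, signals, vfuncs, constants), so the offset of entry
 * @n is the sum of the sizes of everything stored before it. The
 * prerequisites array holds 2-byte entries and is rounded up to an even
 * count, which appears to keep the following blobs 4-byte aligned.
 * SketchHeader, SketchBlob and the sketch_*() helpers are hypothetical
 * stand-ins for Header and InterfaceBlob, and any sizes passed in are
 * assumptions of the caller rather than values read from a typelib.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the per-blob sizes stored in the typelib header. */
typedef struct {
  uint16_t interface_blob_size;
  uint16_t property_blob_size;
  uint16_t function_blob_size;
  uint16_t signal_blob_size;
  uint16_t vfunc_blob_size;
  uint16_t constant_blob_size;
} SketchHeader;

/* Hypothetical subset of the member counts stored in an interface blob. */
typedef struct {
  uint16_t n_prerequisites;
  uint16_t n_properties;
  uint16_t n_methods;
  uint16_t n_signals;
  uint16_t n_vfuncs;
  uint16_t n_constants;
} SketchBlob;

/* Bytes used by the prerequisites array: 2 bytes per entry, with the entry
 * count rounded up to the next even number. */
static uint32_t
sketch_prerequisites_size (const SketchBlob *b)
{
  return ((uint32_t) b->n_prerequisites + (b->n_prerequisites % 2u)) * 2u;
}

/* Offset of signal @n: skip the blob itself, then the prerequisites,
 * properties and methods, then @n earlier signals. */
static uint32_t
sketch_signal_offset (uint32_t            base,
                      const SketchHeader *h,
                      const SketchBlob   *b,
                      uint32_t            n)
{
  assert (n < b->n_signals);

  return base + h->interface_blob_size
       + sketch_prerequisites_size (b)
       + (uint32_t) b->n_properties * h->property_blob_size
       + (uint32_t) b->n_methods * h->function_blob_size
       + n * h->signal_blob_size;
}

/* Offset of constant @n: everything a signal lookup skips, plus all
 * signals and all vfuncs, then @n earlier constants. */
static uint32_t
sketch_constant_offset (uint32_t            base,
                        const SketchHeader *h,
                        const SketchBlob   *b,
                        uint32_t            n)
{
  assert (n < b->n_constants);

  return base + h->interface_blob_size
       + sketch_prerequisites_size (b)
       + (uint32_t) b->n_properties * h->property_blob_size
       + (uint32_t) b->n_methods * h->function_blob_size
       + (uint32_t) b->n_signals * h->signal_blob_size
       + (uint32_t) b->n_vfuncs * h->vfunc_blob_size
       + n * h->constant_blob_size;
}
```

/*
 * The sketch adds bounds assertions on @n that the accessors above do not
 * perform; that is a choice made for the illustration, not a description of
 * the library's behaviour.
 */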
/**
* g_interface_info_get_iface_struct:
* @info: a #GIInterfaceInfo
*
* Returns the layout C structure associated with this #GInterface.
*
* Returns: (transfer full): the #GIStructInfo or %NULL. Free it with
* g_base_info_unref() when done.
*/
GIStructInfo *
g_interface_info_get_iface_struct (GIInterfaceInfo *info)
{
GIRealInfo *rinfo = (GIRealInfo *)info;
InterfaceBlob *blob;
g_return_val_if_fail (info != NULL, NULL);
g_return_val_if_fail (GI_IS_INTERFACE_INFO (info), NULL);
blob = (InterfaceBlob *)&rinfo->typelib->data[rinfo->offset];
if (blob->gtype_struct)
return (GIStructInfo *) _g_info_from_entry (rinfo->repository,
rinfo->typelib, blob->gtype_struct);
else
return NULL;
}
