NHI Number

The National Health Index (NHI) number is the unique person identifier used within the New Zealand health system. It is technically not a number but an alphanumeric identifier of seven characters, originally three letters followed by four digits. It is often referred to as the NHI, although care must be taken with this abbreviation, because the NHI can also refer to the national collection of health care user demographic data (for which the NHI Number is the unique identifier).

The NHI Number, as part of the NHI, was established in 1993. [1]

Usage

The NHI Number is used primarily to identify individuals uniquely within the New Zealand health system, [1] [2] especially in electronic systems. One example is the Medical Warnings System (MWS), which uses it to alert health care providers to risks associated with medical decision-making for specific patients.

Format

NHI numbers are in the format LLLNNNC, where L is a letter (excluding I and O), N is a numeral, and C is a numeric check digit (e.g. ABC1235). The assignment of the first characters is arbitrary and bears no relationship to the individual to whom the number is assigned. The NHI Number is most often written with its alphabetic characters in upper case. The format provides for 13,824,000 (24³ × 10³) unique NHI numbers, although 1,256,727 of these combinations cannot generate a check digit.

NHI Numbers are often referred to as being valid or invalid. Any NHI Number that does not fit the correct format or that has an incorrect check digit is referred to as invalid. Describing an NHI Number as valid usually says nothing about whether it is associated with the right individual. Because the identifier is arbitrary, this cannot be determined from the identifier alone.

Open-source packages are available to check the validity of NHI numbers.

The existing range is expected to be exhausted after 2025. In 2019, a revised standard introduced a new format of LLLNNLX, where X is a letter check digit (e.g. ABC12DV). The new format will be available for allocation from July 2022, and will provide an additional 33,177,600 unique NHI numbers. The two formats will co-exist indefinitely, and all administrative and clinical systems will need to support them both.
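The capacity figures quoted in this section can be reproduced with a few lines of arithmetic. The following Python sketch is illustrative only: the function name is hypothetical, and the count of unusable old-format combinations assumes that a combination has no check digit exactly when its weighted sum (weights 7, 6, 5, 4, 3, 2, as described under Check digit) is divisible by 11.

```python
# Letters allowed in an NHI number: A-Z without I and O (24 symbols).
LETTERS = 24
DIGITS = 10

# Old format LLLNNNC: three letters and three digits are free; C is derived.
old_total = LETTERS**3 * DIGITS**3

# New format LLLNNLX: three letters, two digits and one letter are free; X is derived.
new_total = LETTERS**3 * DIGITS**2 * LETTERS


def count_zero_checksums() -> int:
    """Count old-format prefixes whose weighted sum is divisible by 11.

    ASSUMPTION: these are the combinations that cannot generate a check digit.
    Works on residues modulo 11 rather than enumerating every prefix.
    """
    # ways[r] = number of partial prefixes whose weighted sum is r (mod 11)
    ways = [1] + [0] * 10
    positions = [
        (7, range(1, 25)), (6, range(1, 25)), (5, range(1, 25)),  # letters 1-24
        (4, range(10)), (3, range(10)), (2, range(10)),           # digits 0-9
    ]
    for weight, values in positions:
        nxt = [0] * 11
        for residue, count in enumerate(ways):
            for value in values:
                nxt[(residue + weight * value) % 11] += count
        ways = nxt
    return ways[0]


print(f"{old_total:,}")               # 13,824,000
print(f"{new_total:,}")               # 33,177,600
print(f"{count_zero_checksums():,}")  # 1,256,727
```

Under that assumption the count matches the 1,256,727 unusable combinations stated for the old format.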

All NHI numbers starting with Z are reserved for test purposes. [3]
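A structural check for the two formats might look like the Python sketch below. It tests shape only, not the check digit, and the names `classify_nhi` and `is_test_nhi` are hypothetical helpers introduced for illustration. Upper-casing the input before matching is also an assumption made here for convenience.

```python
import re

# One allowed letter: A-Z without I and O.
_L = "[A-HJ-NP-Z]"
OLD_FORMAT = re.compile(rf"{_L}{{3}}[0-9]{{4}}")            # LLLNNNC
NEW_FORMAT = re.compile(rf"{_L}{{3}}[0-9]{{2}}{_L}{{2}}")   # LLLNNLX


def classify_nhi(value: str) -> str:
    """Return 'old', 'new' or 'malformed' based on shape alone."""
    candidate = value.strip().upper()
    if OLD_FORMAT.fullmatch(candidate):
        return "old"
    if NEW_FORMAT.fullmatch(candidate):
        return "new"
    return "malformed"


def is_test_nhi(value: str) -> bool:
    """True for a well-formed number in the Z range reserved for testing."""
    candidate = value.strip().upper()
    return classify_nhi(candidate) != "malformed" and candidate.startswith("Z")


print(classify_nhi("ABC1235"))   # old
print(classify_nhi("ABC12DV"))   # new
print(classify_nhi("AIC1235"))   # malformed (contains I)
print(is_test_nhi("ZZZ0016"))    # True
```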

Duplicates

When it has been identified that an individual has been assigned more than one NHI Number, one is deemed to be the primary identifier. This is usually done by ranking all assigned numbers in alpha-numeric order and choosing the first one as the primary.

All other NHI Numbers for the individual within the NHI are then linked to the primary one.
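The usual ranking rule can be sketched in Python as follows. The function name `link_duplicates` is hypothetical, and "alpha-numeric order" is assumed here to mean a plain character-by-character sort of the upper-cased strings (so digits sort before letters). The actual NHI system may apply additional rules.

```python
def link_duplicates(nhi_numbers: list[str]) -> dict[str, str]:
    """Map every NHI number held by one individual to the chosen primary.

    ASSUMPTION: the primary is simply the first number in sorted order.
    """
    if not nhi_numbers:
        raise ValueError("at least one NHI number is required")
    normalised = sorted({n.strip().upper() for n in nhi_numbers})
    primary = normalised[0]
    return {n: primary for n in normalised}


links = link_duplicates(["ZZZ0016", "ABC12DV", "ABC1235"])
print(links["ZZZ0016"])  # ABC1235
```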

Check digit

There are two variants of the check digit algorithm, because the old NHI number format has a numeric check digit while the new format has an alphabetic check character. The change to an alphabetic check character was intended to resolve a previously identified weakness in which simple single-character transcription errors are not always detected by the old check digit scheme. [4] The new algorithm, however, makes check-character collisions much more likely for simple single-character transcription errors. [5]

For the new format, each alphabetic character is given a numeric value equal to its ordinal position within a version of the alphabet that omits the letters I and O. The ordinal range is 1–24. This gives A=1 and Z=24, for example. Each numeric character is used with its face value 0–9 in the calculation.
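The character-to-value mapping described above can be expressed as a small Python helper. This is a minimal sketch; `NHI_ALPHABET` and `char_value` are names chosen for illustration.

```python
import string

# The 24-letter alphabet used by NHI numbers: A-Z with I and O removed.
NHI_ALPHABET = "".join(c for c in string.ascii_uppercase if c not in "IO")


def char_value(ch: str) -> int:
    """Numeric value of a single NHI character for the check calculation."""
    if ch in string.digits:
        return int(ch)                   # digits keep their face value 0-9
    return NHI_ALPHABET.index(ch) + 1    # letters map to 1-24; I and O raise ValueError


print(char_value("A"), char_value("J"), char_value("Z"), char_value("7"))  # 1 9 24 7
```

Because I and O are skipped, J takes the value 9 rather than 10, and P takes 14 rather than 16.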

Each character’s equivalent numeric value is then multiplied by its reverse ordinal position within the NHI number. The first value is multiplied by 7, the second by 6, the third by 5, the fourth by 4, the fifth by 3 and the sixth by 2. The sum of the six products is calculated. The calculated sum modulo 24 is subtracted from 24 to give an index number in the range 1–24, and the check character is the letter with that ordinal value. For example, ABC12D gives 7 + 12 + 15 + 4 + 6 + 8 = 52; 52 modulo 24 is 4, and 24 − 4 = 20, which corresponds to V, so the full number is ABC12DV.
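The weighted-sum calculation for the new format can be sketched in Python as below. The function and constant names are hypothetical. The modulus is set to 24 because that value reproduces the ABC12DV example given in the Format section; treat it as an assumption and confirm it against HISO 10046 before relying on it.

```python
NHI_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # A-Z without I and O
WEIGHTS = (7, 6, 5, 4, 3, 2)
MODULUS = 24  # ASSUMPTION: chosen because it reproduces the ABC12DV example


def char_value(ch: str) -> int:
    """Digits keep their face value; letters map to 1-24."""
    return int(ch) if ch in "0123456789" else NHI_ALPHABET.index(ch) + 1


def new_format_check_char(prefix: str) -> str:
    """Check character for the first six characters of an LLLNNLX number."""
    if len(prefix) != 6:
        raise ValueError("expected the first six characters")
    total = sum(w * char_value(c) for w, c in zip(WEIGHTS, prefix))
    index = MODULUS - total % MODULUS   # always in the range 1-24
    return NHI_ALPHABET[index - 1]


print(new_format_check_char("ABC12D"))  # V
print(new_format_check_char("AFC12D"))  # V
```

The second call illustrates the kind of collision mentioned earlier: replacing B with F changes the weighted sum by 6 × 4 = 24, so under this modulus the single-character substitution leaves the check character unchanged.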

For the old format, the check digit is numeric. The algorithm for generating it is described below:

Each alpha character is given a numeric representation equivalent to its ordinal position within the alphabet, starting at A through to Z. The letters I and O are omitted, making the ordinal range 1–24.

Each alpha character's numeric representation is multiplied by its reverse ordinal position within the NHI Number. The first value is multiplied by 7, the second by 6 and the third by 5.

The first three numeric characters are likewise multiplied by their reverse ordinal positions: 4, 3 and 2.

The sum of these products modulo 11 is subtracted from 11 to give the check digit (a result of 10 is translated to 0). If the sum is exactly divisible by 11, no check digit can be generated and the combination cannot be used. For example, ABC123 gives 7 + 12 + 15 + 4 + 6 + 6 = 50; 50 modulo 11 is 6, and 11 − 6 = 5, so the full number is ABC1235.

This scheme is similar to the ISBN check digit scheme.
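The old-format steps can be sketched in Python as follows. The function names are hypothetical, and returning `None` when the weighted sum is divisible by 11 reflects the assumption that such combinations have no check digit.

```python
from typing import Optional

NHI_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # A-Z without I and O
WEIGHTS = (7, 6, 5, 4, 3, 2)


def old_format_check_digit(prefix: str) -> Optional[int]:
    """Check digit for the first six characters (LLLNNN), or None if none exists."""
    letters, digits = prefix[:3], prefix[3:6]
    values = [NHI_ALPHABET.index(c) + 1 for c in letters] + [int(d) for d in digits]
    checksum = sum(w * v for w, v in zip(WEIGHTS, values)) % 11
    if checksum == 0:
        return None                  # ASSUMPTION: no check digit can be generated
    return (11 - checksum) % 10      # a result of 10 is translated to 0


def is_valid_old_format(nhi: str) -> bool:
    """True when a seven-character LLLNNNC string carries the expected check digit."""
    if len(nhi) != 7:
        return False
    if any(c not in NHI_ALPHABET for c in nhi[:3]):
        return False
    if any(c not in "0123456789" for c in nhi[3:]):
        return False
    return old_format_check_digit(nhi[:6]) == int(nhi[6])


print(old_format_check_digit("ABC123"))  # 5
print(is_valid_old_format("ABC1235"))    # True
print(is_valid_old_format("ABC1234"))    # False
print(is_valid_old_format("AAA0110"))    # True (11 - 1 = 10, translated to 0)
```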

Excel formulae to validate NHI numbers in old, new, and both formats

These formulae require Excel version 2010 or later (or equivalent). They read the value to test from cell A2 and assume the input is alphanumeric and uppercase. Checks are made to confirm that the string is 7 characters long, that the letters "I" and "O" are not present, and that letters and digits are in the correct places. The formulae return TRUE for a valid NHI number and FALSE otherwise.

Old format

=AND(IF(LEN(A2)=7,TRUE,FALSE),NOT(ISNUMBER(FIND("I",A2))),NOT(ISNUMBER(FIND("O",A2))),ISTEXT(LEFT(A2,3)),IF(ISNUMBER(VALUE(RIGHT(A2,4))),IF(ISERR(MID(A2,7,1)*1),FALSE,MOD(7*IF(ISERR(MID(A2,1,1)*1),IF(CODE(MID(A2,1,1))>79,CODE(MID(A2,1,1))-66,IF(CODE(MID(A2,1,1))>72,CODE(MID(A2,1,1))-65,CODE(MID(A2,1,1))-64)),MID(A2,1,1))+6*IF(ISERR(MID(A2,2,1)*1),IF(CODE(MID(A2,2,1))>79,CODE(MID(A2,2,1))-66,IF(CODE(MID(A2,2,1))>72,CODE(MID(A2,2,1))-65,CODE(MID(A2,2,1))-64)),MID(A2,2,1))+5*IF(ISERR(MID(A2,3,1)*1),IF(CODE(MID(A2,3,1))>79,CODE(MID(A2,3,1))-66,IF(CODE(MID(A2,3,1))>72,CODE(MID(A2,3,1))-65,CODE(MID(A2,3,1))-64)),MID(A2,3,1))+4*IF(NOT(ISERR(MID(A2,4,1)*1)),MID(A2,4,1))+3*IF(NOT(ISERR(MID(A2,5,1)*1)),MID(A2,5,1))+2*IF(NOT(ISERR(MID(A2,6,1)*1)),MID(A2,6,1))+IF(MID(A2,7,1)="0",10,MID(A2,7,1)*1),11)=0)))

New format

=AND(LEN(A2)=7,NOT(ISNUMBER(FIND("I",A2))),NOT(ISNUMBER(FIND("O",A2))),NOT(ISERR(MID(A2,4,2)*1)),24-MOD(7*IF(ISERR(MID(A2,1,1)*1),IF(CODE(MID(A2,1,1))>79,CODE(MID(A2,1,1))-66,IF(CODE(MID(A2,1,1))>72,CODE(MID(A2,1,1))-65,CODE(MID(A2,1,1))-64)),MID(A2,1,1))+6*IF(ISERR(MID(A2,2,1)*1),IF(CODE(MID(A2,2,1))>79,CODE(MID(A2,2,1))-66,IF(CODE(MID(A2,2,1))>72,CODE(MID(A2,2,1))-65,CODE(MID(A2,2,1))-64)),MID(A2,2,1))+5*IF(ISERR(MID(A2,3,1)*1),IF(CODE(MID(A2,3,1))>79,CODE(MID(A2,3,1))-66,IF(CODE(MID(A2,3,1))>72,CODE(MID(A2,3,1))-65,CODE(MID(A2,3,1))-64)),MID(A2,3,1))+4*IF(NOT(ISERR(MID(A2,4,1)*1)),MID(A2,4,1))+3*IF(NOT(ISERR(MID(A2,5,1)*1)),MID(A2,5,1))+2*IF(ISERR(MID(A2,6,1)*1),IF(CODE(MID(A2,6,1))>79,CODE(MID(A2,6,1))-66,IF(CODE(MID(A2,6,1))>72,CODE(MID(A2,6,1))-65,CODE(MID(A2,6,1))-64)),MID(A2,6,1)),24)=IF(ISERR(MID(A2,7,1)*1),IF(CODE(MID(A2,7,1))>79,CODE(MID(A2,7,1))-66,IF(CODE(MID(A2,7,1))>72,CODE(MID(A2,7,1))-65,CODE(MID(A2,7,1))-64)),MID(A2,7,1)))

Both formats

=AND(LEN(A2)=7,NOT(ISNUMBER(FIND("I",A2))),NOT(ISNUMBER(FIND("O",A2))),NOT(ISERR(MID(A2,4,2)*1)),IF(ISERR(MID(A2,6,1)*1),23-MOD(7*IF(ISERR(MID(A2,1,1)*1),IF(CODE(MID(A2,1,1))>79,CODE(MID(A2,1,1))-66,IF(CODE(MID(A2,1,1))>72,CODE(MID(A2,1,1))-65,CODE(MID(A2,1,1))-64)),MID(A2,1,1))+6*IF(ISERR(MID(A2,2,1)*1),IF(CODE(MID(A2,2,1))>79,CODE(MID(A2,2,1))-66,IF(CODE(MID(A2,2,1))>72,CODE(MID(A2,2,1))-65,CODE(MID(A2,2,1))-64)),MID(A2,2,1))+5*IF(ISERR(MID(A2,3,1)*1),IF(CODE(MID(A2,3,1))>79,CODE(MID(A2,3,1))-66,IF(CODE(MID(A2,3,1))>72,CODE(MID(A2,3,1))-65,CODE(MID(A2,3,1))-64)),MID(A2,3,1))+4*IF(NOT(ISERR(MID(A2,4,1)*1)),MID(A2,4,1))+3*IF(NOT(ISERR(MID(A2,5,1)*1)),MID(A2,5,1))+2*IF(ISERR(MID(A2,6,1)*1),IF(CODE(MID(A2,6,1))>79,CODE(MID(A2,6,1))-66,IF(CODE(MID(A2,6,1))>72,CODE(MID(A2,6,1))-65,CODE(MID(A2,6,1))-64)),MID(A2,6,1)),23)=IF(ISERR(MID(A2,7,1)*1),IF(CODE(MID(A2,7,1))>79,CODE(MID(A2,7,1))-66,IF(CODE(MID(A2,7,1))>72,CODE(MID(A2,7,1))-65,CODE(MID(A2,7,1))-64)),MID(A2,7,1)),11-MOD(7*IF(ISERR(MID(A2,1,1)*1),IF(CODE(MID(A2,1,1))>79,CODE(MID(A2,1,1))-66,IF(CODE(MID(A2,1,1))>72,CODE(MID(A2,1,1))-65,CODE(MID(A2,1,1))-64)),MID(A2,1,1))+6*IF(ISERR(MID(A2,2,1)*1),IF(CODE(MID(A2,2,1))>79,CODE(MID(A2,2,1))-66,IF(CODE(MID(A2,2,1))>72,CODE(MID(A2,2,1))-65,CODE(MID(A2,2,1))-64)),MID(A2,2,1))+5*IF(ISERR(MID(A2,3,1)*1),IF(CODE(MID(A2,3,1))>79,CODE(MID(A2,3,1))-66,IF(CODE(MID(A2,3,1))>72,CODE(MID(A2,3,1))-65,CODE(MID(A2,3,1))-64)),MID(A2,3,1))+4*IF(NOT(ISERR(MID(A2,4,1)*1)),MID(A2,4,1))+3*IF(NOT(ISERR(MID(A2,5,1)*1)),MID(A2,5,1))+2*IF(NOT(ISERR(MID(A2,6,1)*1)),MID(A2,6,1)),11)=IF(NOT(ISERR(MID(A2,7,1)*1)),MID(A2,7,1)*1)))

References

  1. New Zealand Health Information Service. National Health Index (NHI). Retrieved 13 June 2007.
  2. New Zealand Health Information Service. NHI Number. Retrieved 13 June 2007.
  3. Ministry of Health. "HISO 10046:2019 Consumer Health Identity Standard". Retrieved 15 August 2020.
  4. MacRae, Jayden (November 2015). "Evaluating a weakness of the National Health Index identifier check-digit to transcription errors in data entry" (PDF). Health Informatics New Zealand Conference 2015. Retrieved 13 May 2021.
  5. MacRae, Jayden. "Check-Digit Collisions in the New NHI Implementation". DataCraft Analytics. Retrieved 5 March 2023.