The Arabic chars module contains all Arabic letters, a subset of Unicode.
|
|||
|
|||
is letter functions | |||
---|---|---|---|
general letter functions | |||
integer; |
|
||
unicode; |
|
||
unicode; |
|
||
Has letter functions | |||
|
|||
word and text functions | |||
|
|||
|
|||
Boolean |
|
||
Boolean |
|
||
Strip functions | |||
unicode. |
|
||
unicode. |
|
||
unicode. |
|
||
unicode. |
|
||
unicode. |
|
|
|||
COMMA = u'\u060C'
|
|||
SEMICOLON = u'\u061B'
|
|||
QUESTION = u'\u061F'
|
|||
HAMZA = u'\u0621'
|
|||
ALEF_MADDA = u'\u0622'
|
|||
ALEF_HAMZA_ABOVE = u'\u0623'
|
|||
WAW_HAMZA = u'\u0624'
|
|||
ALEF_HAMZA_BELOW = u'\u0625'
|
|||
YEH_HAMZA = u'\u0626'
|
|||
ALEF = u'\u0627'
|
|||
BEH = u'\u0628'
|
|||
TEH_MARBUTA = u'\u0629'
|
|||
TEH = u'\u062a'
|
|||
THEH = u'\u062b'
|
|||
JEEM = u'\u062c'
|
|||
HAH = u'\u062d'
|
|||
KHAH = u'\u062e'
|
|||
DAL = u'\u062f'
|
|||
THAL = u'\u0630'
|
|||
REH = u'\u0631'
|
|||
ZAIN = u'\u0632'
|
|||
SEEN = u'\u0633'
|
|||
SHEEN = u'\u0634'
|
|||
SAD = u'\u0635'
|
|||
DAD = u'\u0636'
|
|||
TAH = u'\u0637'
|
|||
ZAH = u'\u0638'
|
|||
AIN = u'\u0639'
|
|||
GHAIN = u'\u063a'
|
|||
TATWEEL = u'\u0640'
|
|||
FEH = u'\u0641'
|
|||
QAF = u'\u0642'
|
|||
KAF = u'\u0643'
|
|||
LAM = u'\u0644'
|
|||
MEEM = u'\u0645'
|
|||
NOON = u'\u0646'
|
|||
HEH = u'\u0647'
|
|||
WAW = u'\u0648'
|
|||
ALEF_MAKSURA = u'\u0649'
|
|||
YEH = u'\u064a'
|
|||
MADDA_ABOVE = u'\u0653'
|
|||
HAMZA_ABOVE = u'\u0654'
|
|||
HAMZA_BELOW = u'\u0655'
|
|||
ZERO = u'\u0660'
|
|||
ONE = u'\u0661'
|
|||
TWO = u'\u0662'
|
|||
THREE = u'\u0663'
|
|||
FOUR = u'\u0664'
|
|||
FIVE = u'\u0665'
|
|||
SIX = u'\u0666'
|
|||
SEVEN = u'\u0667'
|
|||
EIGHT = u'\u0668'
|
|||
NINE = u'\u0669'
|
|||
PERCENT = u'\u066a'
|
|||
DECIMAL = u'\u066b'
|
|||
THOUSANDS = u'\u066c'
|
|||
STAR = u'\u066d'
|
|||
MINI_ALEF = u'\u0670'
|
|||
ALEF_WASLA = u'\u0671'
|
|||
FULL_STOP = u'\u06d4'
|
|||
BYTE_ORDER_MARK = u'\ufeff'
|
|||
FATHATAN = u'\u064b'
|
|||
DAMMATAN = u'\u064c'
|
|||
KASRATAN = u'\u064d'
|
|||
FATHA = u'\u064e'
|
|||
DAMMA = u'\u064f'
|
|||
KASRA = u'\u0650'
|
|||
SHADDA = u'\u0651'
|
|||
SUKUN = u'\u0652'
|
|||
SMALL_ALEF = u"\u0670"
|
|||
SMALL_WAW = u"\u06E5"
|
|||
SMALL_YEH = u"\u06E6"
|
|||
LAM_ALEF = u'\ufefb'
|
|||
LAM_ALEF_HAMZA_ABOVE = u'\ufef7'
|
|||
LAM_ALEF_HAMZA_BELOW = u'\ufef9'
|
|||
LAM_ALEF_MADDA_ABOVE = u'\ufef5'
|
|||
simple_LAM_ALEF = u'\u0644\u0627'
|
|||
simple_LAM_ALEF_HAMZA_ABOVE = u'\u0644\u0623'
|
|||
simple_LAM_ALEF_HAMZA_BELOW = u'\u0644\u0625'
|
|||
simple_LAM_ALEF_MADDA_ABOVE = u'\u0644\u0622'
|
|||
LETTERS = u''.join([ALEF, BEH, TEH, TEH_MARBUTA, THEH, JEEM, H
|
|||
TASHKEEL = FATHATAN, DAMMATAN, KASRATAN, FATHA, DAMMA, KASRA,
|
|||
HARAKAT = FATHATAN, DAMMATAN, KASRATAN, FATHA, DAMMA, KASRA, S
|
|||
SHORTHARAKAT = FATHA, DAMMA, KASRA, SUKUN
|
|||
TANWIN = FATHATAN, DAMMATAN, KASRATAN
|
|||
LIGUATURES = LAM_ALEF, LAM_ALEF_HAMZA_ABOVE, LAM_ALEF_HAMZA_BE
|
|||
HAMZAT = HAMZA, WAW_HAMZA, YEH_HAMZA, HAMZA_ABOVE, HAMZA_BELOW
|
|||
ALEFAT = ALEF, ALEF_MADDA, ALEF_HAMZA_ABOVE, ALEF_HAMZA_BELOW,
|
|||
WEAK = ALEF, WAW, YEH, ALEF_MAKSURA
|
|||
YEHLIKE = YEH, YEH_HAMZA, ALEF_MAKSURA, SMALL_YEH
|
|||
WAWLIKE = WAW, WAW_HAMZA, SMALL_WAW
|
|||
TEHLIKE = TEH, TEH_MARBUTA
|
|||
SMALL = SMALL_ALEF, SMALL_WAW, SMALL_YEH
|
|||
MOON = HAMZA, ALEF_MADDA, ALEF_HAMZA_ABOVE, ALEF_HAMZA_BELOW,
|
|||
SUN = TEH, THEH, DAL, THAL, REH, ZAIN, SEEN, SHEEN, SAD, DAD,
|
|||
AlphabeticOrder = {ALEF: 1, BEH: 2, TEH: 3, TEH_MARBUTA: 3, TH
|
|||
HARAKAT_pattern = re.compile(ur"["+ u"".join(TASHKEEL)+ u"]")
|
|||
HAMZAT_pattern = re.compile(ur"["+ u"".join(HAMZAT)+ u"]")
|
|||
ALEFAT_pattern = re.compile(ur"["+ u"".join(ALEFAT)+ u"]")
|
|||
LIGUATURES_pattern = re.compile(ur"["+ u"".join(LIGUATURES)+ u"]")
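
The patterns above are character classes built by joining a tuple of marks. A minimal Python 3 sketch of the same idea follows. The `ur""` prefix in the listing is Python 2 only, so plain strings are used here. `TASHKEEL_PATTERN` and `count_marks` are illustrative names, not part of the module, and the eight marks are taken from the Tashkeel description on this page because the tuple in the listing is truncated.

```python
import re

# Code points copied from the constants listed above.
FATHATAN, DAMMATAN, KASRATAN = "\u064b", "\u064c", "\u064d"
FATHA, DAMMA, KASRA = "\u064e", "\u064f", "\u0650"
SHADDA, SUKUN = "\u0651", "\u0652"

# Assumption: TASHKEEL holds exactly these eight marks.
TASHKEEL = (FATHATAN, DAMMATAN, KASRATAN, FATHA, DAMMA, KASRA, SUKUN, SHADDA)

# None of the joined characters is special inside a character class,
# so no escaping is needed.
TASHKEEL_PATTERN = re.compile("[" + "".join(TASHKEEL) + "]")


def count_marks(text):
    """Return how many tashkeel marks occur in text (hypothetical helper)."""
    return len(TASHKEEL_PATTERN.findall(text))
```

Compiling the pattern once at import time avoids rebuilding it on every call, which is presumably why the module exposes the patterns as variables.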
|
|
Checks for Arabic Sukun Mark.
|
Checks for Arabic Shadda Mark.
|
Checks for Arabic Tatweel letter modifier.
|
Checks for Arabic Tanwin Marks (FATHATAN, DAMMATAN, KASRATAN).
|
Checks for Arabic Tashkeel Marks (FATHA, DAMMA, KASRA, SUKUN, SHADDA, FATHATAN, DAMMATAN, KASRATAN).
|
Checks for Arabic Harakat Marks (FATHA,DAMMA,KASRA,SUKUN,TANWIN).
|
Checks for Arabic short Harakat Marks (FATHA,DAMMA,KASRA,SUKUN).
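
The mark checks described above reduce to membership tests on the constant tuples. The sketch below assumes each check receives a single character. The function names are hypothetical, since the real names do not survive in this page.

```python
# Code points copied from the constants listed on this page.
TANWIN = ("\u064b", "\u064c", "\u064d")                  # FATHATAN, DAMMATAN, KASRATAN
SHORTHARAKAT = ("\u064e", "\u064f", "\u0650", "\u0652")  # FATHA, DAMMA, KASRA, SUKUN
HARAKAT = TANWIN + SHORTHARAKAT
SHADDA = "\u0651"
TASHKEEL = HARAKAT + (SHADDA,)


def is_tanwin(char):
    """True if char is one of the three tanwin marks."""
    return char in TANWIN


def is_short_haraka(char):
    """True if char is FATHA, DAMMA, KASRA or SUKUN."""
    return char in SHORTHARAKAT


def is_haraka(char):
    """True if char is a short haraka or a tanwin mark (SHADDA excluded)."""
    return char in HARAKAT


def is_tashkeel(char):
    """True if char is any haraka or SHADDA."""
    return char in TASHKEEL
```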
|
Checks for Arabic Ligatures like LamAlef. (LAM_ALEF, LAM_ALEF_HAMZA_ABOVE, LAM_ALEF_HAMZA_BELOW, LAM_ALEF_MADDA_ABOVE)
|
Checks for Arabic Hamza forms. HAMZAT are (HAMZA, WAW_HAMZA, YEH_HAMZA, HAMZA_ABOVE, HAMZA_BELOW, ALEF_HAMZA_BELOW, ALEF_HAMZA_ABOVE).
|
Checks for Arabic Alef forms. ALEFAT are (ALEF, ALEF_MADDA, ALEF_HAMZA_ABOVE, ALEF_HAMZA_BELOW, ALEF_WASLA, ALEF_MAKSURA).
|
Checks for Arabic Yeh forms. Yeh forms : YEH, YEH_HAMZA, SMALL_YEH, ALEF_MAKSURA
|
Checks for Arabic Waw like forms. Waw forms : WAW, WAW_HAMZA, SMALL_WAW
|
Checks for Arabic Teh forms. Teh forms : TEH, TEH_MARBUTA
|
Checks for Arabic Small letters. SMALL Letters : SMALL ALEF, SMALL WAW, SMALL YEH
|
Checks for Arabic Weak letters. Weak Letters : ALEF, WAW, YEH, ALEF_MAKSURA
|
Checks for Arabic Moon letters. Moon Letters :
|
Checks for Arabic Sun letters. Sun Letters :
|
Returns the Arabic letter order, between 1 and 29. Alef is 1, Yeh is 28, Hamza is 29. Teh Marbuta has the same order as Teh, 3.
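
The order lookup can be pictured as a dictionary access. The sketch below uses only the entries visible in the truncated `AlphabeticOrder` listing plus the two values stated above (Yeh 28, Hamza 29). `PARTIAL_ORDER`, `letter_order` and the default of 0 for unknown characters are assumptions made for illustration.

```python
ALEF, BEH, TEH, TEH_MARBUTA = "\u0627", "\u0628", "\u062a", "\u0629"
YEH, HAMZA = "\u064a", "\u0621"

# Deliberately partial: the full mapping is truncated in this page.
PARTIAL_ORDER = {ALEF: 1, BEH: 2, TEH: 3, TEH_MARBUTA: 3, YEH: 28, HAMZA: 29}


def letter_order(char, default=0):
    """Return the alphabetic order of char, or default if it is not mapped."""
    return PARTIAL_ORDER.get(char, default)
```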
|
Returns the Arabic letter name in Arabic.
|
Returns a list of Arabic characters, covering the range \u060c to \u0652.
|
Checks if the arabic word contains shadda.
|
Checks if the Arabic word is vocalized. The word mustn't contain any spaces or punctuation.
|
Checks if the arabic text is vocalized. The text can contain many words and spaces
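
A rough sketch of the shadda and vocalization checks, under the assumption that "vocalized" means "contains at least one tashkeel mark". The module's actual rule may be stricter, for example by rejecting words with spaces. Both function names are hypothetical.

```python
import re

SHADDA = "\u0651"

# U+064B..U+0652 is the contiguous run FATHATAN..SUKUN, including SHADDA.
_TASHKEEL_RE = re.compile("[\u064b-\u0652]")


def has_shadda(word):
    """True if the word contains a SHADDA mark."""
    return SHADDA in word


def is_vocalized_text(text):
    """True if at least one tashkeel mark appears anywhere in text."""
    return _TASHKEEL_RE.search(text) is not None
```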
|
Checks for Arabic Unicode block characters.
|
Checks for a valid Arabic word. An Arabic word
|
Strip Harakat from an Arabic word, except Shadda. The stripped marks are :
Example:
    >>> text = u"الْعَرَبِيّةُ"
    >>> stripTashkeel(text)
    العربيّة
|
Strip vowels from a text, including Shadda. The stripped marks are :
Example:
    >>> text = u"الْعَرَبِيّةُ"
    >>> stripTashkeel(text)
    العربية
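
The two strip operations above differ only in whether Shadda survives. A minimal Python 3 sketch, assuming both are plain regex substitutions. `strip_harakat` and `strip_tashkeel` are illustrative names and need not match the module's own functions.

```python
import re

# FATHATAN..KASRA plus SUKUN; SHADDA (U+0651) is deliberately left out.
_HARAKAT_RE = re.compile("[\u064b-\u0650\u0652]")
# The full run FATHATAN..SUKUN, SHADDA included.
_TASHKEEL_RE = re.compile("[\u064b-\u0652]")


def strip_harakat(text):
    """Remove harakat but keep SHADDA."""
    return _HARAKAT_RE.sub("", text)


def strip_tashkeel(text):
    """Remove every tashkeel mark, SHADDA included."""
    return _TASHKEEL_RE.sub("", text)


# The fully vocalized word used in the examples above, spelled as escapes.
WORD = "\u0627\u0644\u0652\u0639\u064e\u0631\u064e\u0628\u0650\u064a\u0651\u0629\u064f"
```

With these definitions, `strip_harakat(WORD)` keeps the U+0651 after the Yeh, while `strip_tashkeel(WORD)` returns the bare letters.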
|
Strip tatweel from a text and return the resulting text.
Example:
    >>> text = u"العـــــربية"
    >>> stripTatweel(text)
    العربية
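
Since Tatweel is a single code point (U+0640), removing it needs no regex. A short sketch with a hypothetical `strip_tatweel` helper:

```python
TATWEEL = "\u0640"


def strip_tatweel(text):
    """Return text with every TATWEEL (kashida) character removed."""
    return text.replace(TATWEEL, "")
```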
|
Normalize Lam Alef ligatures into two letters (LAM and ALEF), and return the resulting text. Some systems present the Lam Alef ligature as a single letter; this function converts it into two letters. The ligatures converted into LAM and ALEF are :
Example:
    >>> text = u"لانها لالء الاسلام"
    >>> normalize_lamalef(text)
    لانها لالئ الاسلام
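
One way to sketch this normalization is a translation table that maps each ligature constant to its `simple_` two-letter counterpart from the variable listing. `LAM_ALEF_MAP` and `normalize_lam_alef` are illustrative names. Only the four code points listed on this page are handled, which are the isolated presentation forms.

```python
# Ligature code point -> LAM followed by the matching ALEF form.
LAM_ALEF_MAP = {
    "\ufefb": "\u0644\u0627",  # LAM_ALEF
    "\ufef7": "\u0644\u0623",  # LAM_ALEF_HAMZA_ABOVE
    "\ufef9": "\u0644\u0625",  # LAM_ALEF_HAMZA_BELOW
    "\ufef5": "\u0644\u0622",  # LAM_ALEF_MADDA_ABOVE
}

# str.maketrans accepts single-character keys mapped to strings of any length.
_LAM_ALEF_TABLE = str.maketrans(LAM_ALEF_MAP)


def normalize_lam_alef(text):
    """Replace each listed ligature with its two-letter sequence."""
    return text.translate(_LAM_ALEF_TABLE)
```

A translation table does the replacement in a single pass over the text, which keeps the function simple when several ligatures occur in the same string.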
|
Returns True if the given word has the same vocalization as the vocalized pattern, or a partial one.
|
|
LETTERS
|
TASHKEEL
|
HARAKAT
|
LIGUATURES
|
HAMZAT
|
ALEFAT
|
MOON
|
SUN
|
AlphabeticOrder
|
Generated by Epydoc 3.0.1 on Mon Mar 01 18:27:26 2010 | http://epydoc.sourceforge.net |