8.appendix added.

2024-08-25 12:23:34 +08:00
parent e5bfa87753
commit b78bab86b0
33 changed files with 1057 additions and 720 deletions
--- a/1000-hours/sounds-of-american-english/1-phonemes.md
+++ b/1000-hours/sounds-of-american-english/1-phonemes.md
@@ -359,9 +359,9 @@
 </tr>
 <tr>
 <td><span class="pho">ŋ</span><span class="speak-word-inline" data-audio-uk-male="/audios/uk_phonetics_sound_sing_2023feb.mp3"></span></td>
-<td><b>th</b>ank <span class="pho alt not-display">θæŋk</span><span class="speak-word-inline" data-audio-uk-female="/audios/thank-uk-female.mp3" data-audio-uk-male="/audios/thank-uk-male.mp3"></span></td>
+<td>tha<b>n</b>k <span class="pho alt not-display">θæŋk</span><span class="speak-word-inline" data-audio-uk-female="/audios/thank-uk-female.mp3" data-audio-uk-male="/audios/thank-uk-male.mp3"></span></td>
 <td><span class="pho">ŋ</span><span class="speak-word-inline" data-audio-us-male="/audios/us_phonetics_sound_sing_2023feb.mp3"></span></td>
-<td><b>th</b>ank <span class="pho alt not-display">θæŋk</span><span class="speak-word-inline" data-audio-us-female="/audios/thank-us-female.mp3" data-audio-us-male="/audios/thank-us-male.mp3"></span></td>
+<td>tha<b>n</b>k <span class="pho alt not-display">θæŋk</span><span class="speak-word-inline" data-audio-us-female="/audios/thank-us-female.mp3" data-audio-us-male="/audios/thank-us-male.mp3"></span></td>
 </tr>
 <tr>
 <td><span class="pho">l</span><span class="speak-word-inline" data-audio-uk-male="/audios/uk_phonetics_sound_look_2023feb.mp3"></span></td>
--- a/1000-hours/sounds-of-american-english/3.2.11-mnŋ.md
+++ b/1000-hours/sounds-of-american-english/3.2.11-mnŋ.md
@@ -28,9 +28,9 @@
 </tr>
 <tr>
 <td><span class="pho">ŋ</span><span class="speak-word-inline" data-audio-uk-male="/audios/uk_phonetics_sound_sing_2023feb.mp3"></span></td>
-<td><b>th</b>ank <span class="pho alt">θæŋk</span><span class="speak-word-inline" data-audio-uk-female="/audios/thank-uk-female.mp3" data-audio-uk-male="/audios/thank-uk-male.mp3"></span></td>
+<td>tha<b>n</b>k <span class="pho alt">θæŋk</span><span class="speak-word-inline" data-audio-uk-female="/audios/thank-uk-female.mp3" data-audio-uk-male="/audios/thank-uk-male.mp3"></span></td>
 <td><span class="pho">ŋ</span><span class="speak-word-inline" data-audio-us-male="/audios/us_phonetics_sound_sing_2023feb.mp3"></span></td>
-<td><b>th</b>ank <span class="pho alt">θæŋk</span><span class="speak-word-inline" data-audio-us-female="/audios/thank-us-female.mp3" data-audio-us-male="/audios/thank-us-male.mp3"></span></td>
+<td>tha<b>n</b>k <span class="pho alt">θæŋk</span><span class="speak-word-inline" data-audio-us-female="/audios/thank-us-female.mp3" data-audio-us-male="/audios/thank-us-male.mp3"></span></td>
 </tr>
 </tbody>
 </table>
--- a/1000-hours/sounds-of-american-english/8-appendix.md
+++ b/1000-hours/sounds-of-american-english/8-appendix.md
@@ -0,0 +1,3 @@
+# Appendix
+
+This appendix covers a few desktop tools that are handy for everyday use.
--- a/1000-hours/sounds-of-american-english/8.1-inputting-phonemes-and-symbols.md
+++ b/1000-hours/sounds-of-american-english/8.1-inputting-phonemes-and-symbols.md
@@ -0,0 +1,63 @@
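+# 8.1. Typing phonetic and special symbols
+
+Typing phonetic symbols (and other special symbols) in electronic documents has always been a hassle.
+
+Once again I use Alfred to help. Here is the workflow file:
+
+> [IPA-Phonetic-Symbols](https://1000h.org/public/alfred-workflows/IPA-Phonetic-Symbols.alfredworkflow)
+
+Take the keyword `ipae` as an example. Bring up Alfred and type it:
+
+![ipae](/images/ipae.png)
+
+You can then press `CMD + number` to insert the corresponding symbol into the current text. For example, `CMD + 4` inserts <span class="pho">ɝː</span> into the current text editor.
+
+The table below lists the Alfred keyword for each symbol:
+
+| Keyword | Symbol |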
+| ----- | ----- |
+| `ipaa`  |  <span class="pho">ʌ</span>  |
+| `ipaaa`  |  <span class="pho">ɑ</span>  |
+| `ipaae`  |  <span class="pho">æ</span>  |
+| `ipae`  |  <span class="pho">ə</span>  |
+| `ipaeeer`  |  <span class="pho">ɝː</span>  |
+| `ipaer`  |  <span class="pho">ɚ</span>  |
+| `ipaes`  |  <span class="pho">ᵊ</span>  |
+| `ipai`  |  <span class="pho">ɪ</span>  |
+| `ipau`  |  <span class="pho">ʊ</span>  |
+| `ipao`  |  <span class="pho">ɒ</span>  |
+| `ipaoo`  |  <span class="pho">ɔ</span>  |
+| `ipal`  |  <span class="pho">ɤ</span>  |
+| `ipatd`  |  <span class="pho">t̠</span>  |
+| `ipatg`  |  <span class="pho">ʔ</span>  |
+| `ipats`  |  <span class="pho">ᵗ</span>  |
+| `ipan`  |  <span class="pho">ŋ</span>  |
+| `ipath`  |  <span class="pho">θ</span>  |
+| `ipad`  |  <span class="pho">ð</span>  |
+| `ipas`  |  <span class="pho">ʃ</span>  |
+| `ipaz`  |  <span class="pho">ʒ</span>  |
+| `ipaj`  |  <span class="pho">ʲ</span>  |
+| `ipaw`  |  <span class="pho">ʷ</span>  |
+| `ipa1`  |  <span class="pho">◌̅</span> flat  |
+| `ipa2`  |  <span class="pho">◌́</span> rise  |
+| `ipa3`  |  <span class="pho">◌̌</span> fall-rise  |
+| `ipa4`  |  <span class="pho">◌̀</span> fall  |
+| `ipa5`  |  <span class="pho">◌̂</span> pitch raise  |
+| `ipa6`  |  <span class="pho">◌̲</span> long vowel  |
+| `ipa7`  |  <span class="pho">◌̩</span> syllabic consonant  |
+| `ipa8`  |  <span class="pho">◌̥</span> voiceless  |
+| `ipa9`  |  <span class="pho">◌̚</span> stop  |
+| `ipa0`  |  <span class="pho">◌</span>  |
+| `ipa:`  |  <span class="pho">ː</span> long vowel symbol  |
+| `ipa"`  |  <span class="pho">ˈ</span> primary stress  |
+| `ipa'`  |  <span class="pho">ˌ</span> secondary stress  |
+| `ipa-`  |  <span class="pho">◌‿◌</span> linking  |
+| `ipa\|` |  <span class="pho">‖</span> grouping boundary  |
+| `-->`  |  <span class="pho">⭢</span>  |
+| `<--`  |  <span class="pho">⭠</span>  |
+| `<->`  |  <span class="pho">⭤</span>  |
+| `irise`  |  <span class="pho">⤴</span> sentence intonation rise  |
+| `idown`  |  <span class="pho">⤵</span> sentence intonation fall  |
+
+
+
--- a/1000-hours/sounds-of-american-english/8.2-cepd-phonetics-and-sound.md
+++ b/1000-hours/sounds-of-american-english/8.2-cepd-phonetics-and-sound.md
@@ -0,0 +1,110 @@
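+# 8.2. Getting CEPD phonetic transcriptions
+
+[Alfred](https://www.alfredapp.com/) is a paid macOS app that lets you define workflows to carry out relatively complex tasks. For example, a keyword can launch a Python script that looks up the phonetic transcription of a word (or even a whole sentence) in the Cambridge English Pronouncing Dictionary (CEPD).
+
+> For how to use Alfred, see:
+> https://github.com/xiaolai/apple-computer-literacy/blob/main/alfred.md
+
+An open-source repository on GitHub provides the Cambridge English Pronouncing Dictionary as a JSON database:
+
+> https://github.com/zelic91/camdict
+
+Download [cam_dict.refined.json](https://github.com/zelic91/camdict/raw/main/cam_dict.refined.json) from that repository and save it somewhere on your machine.
+
+I wrote an Alfred workflow that uses the Python 3 that ships with macOS, `/usr/bin/python3`:
+
+> [CEPD-phonetic-transcription.alfredworkflow](https://1000h.org/public/alfred-workflows/CEPD-phonetic-transcription.alfredworkflow)
+
+After downloading the file, import it into Alfred.
+
+Before using it, note:
+
+> * Change the path to `cam_dict.refined.json` inside each Python script
+
+The workflow provides the following keywords:
+
+> * `cams`: look up the transcription (American pronunciation)
+> * `camk`: look up the transcription (British pronunciation)
+> * `camsd`: open the online page with CEPD's human-recorded demonstration audio (American pronunciation) in the browser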
+> * `camkd`: open the online page with CEPD's human-recorded demonstration audio (British pronunciation) in the browser
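+> * `camw`: open the CEPD lookup page in the browser
+> * `ipa`: return the transcription from the CMU (Carnegie Mellon University) pronunciation database
+
+Below is the Python script in the transcription lookup workflow (keyword `cams`):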
+
+```python
+#
+# NOTE: Python 2 is deprecated in macOS, and has been removed from macOS 12.3+
+#
+import sys
+import json
+
+# The database is assumed to be a file with one JSON record per line; load it from disk.
+# If your JSON data lives in memory or uses another layout, adapt this part.
+def load_json_database(file_path):
+    records = []
+    with open(file_path, 'r', encoding='utf-8') as file:
+        for line in file:
+            try:
+                record = json.loads(line)
+                records.append(record)
+            except json.JSONDecodeError as e:
+                print(f"Error parsing JSON: {e}", file=sys.stderr)
+    return records
+
+# Look up a word in the JSON database
+def search_in_json_database(database, search_word, region):
+    for record in database:
+        # Check whether the word field matches
+        if record.get('word') == search_word:
+            # On a match, collect the pronunciation info for the requested region
+            pos_items = record.get('pos_items', [])
+            for pos_item in pos_items:
+                pronunciations = pos_item.get('pronunciations', [])
+                for pronunciation in pronunciations:
+                    if pronunciation.get('region') == region:
+                        # Found a pronunciation for the region; return its details
+                        return {
+                            'pronunciation': pronunciation.get('pronunciation'),
+                            'audio': pronunciation.get('audio')
+                        }
+    # No matching word, or no pronunciation for the region: return 'not exist'
+    return 'not exist'
+
+# Path to cam_dict.refined.json
+json_db_file_path = '/Users/joker/github/camdict/cam_dict.refined.json'
+
+# The word or sentence to look up
+search_word = sys.argv[1]
+
+region = "us"
+
+json_database = load_json_database(json_db_file_path)
+
+# Replace punctuation in the text with spaces, then split it into words
+punctuations = ",.?!;"
+for p in punctuations:
+    search_word = search_word.replace(p, " ")
+words = search_word.split()
+
+phonetics = []
+
+for w in words:
+    # Look up each word in lowercase
+    w = w.lower()
+
+    # Punctuation was already replaced with spaces above,
+    # so the word can be looked up as is
+    result = search_in_json_database(json_database, w, region)
+
+    if result == 'not exist' or not result['pronunciation']:
+        phonetics.append(w + "*")  # mark words without a transcription
+    else:
+        phonetics.append(result['pronunciation'])
+
+# Join the transcriptions with single spaces;
+# words marked with '*' were not found in the database
+returnvalue = ' '.join(phonetics)
+
+sys.stdout.write(returnvalue)
+```
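
The lookup in 8.2 scans the whole record list once for every word of a sentence. Below is a minimal sketch of one alternative: build a dictionary index once, then look each word up directly. It assumes the same record layout the script reads (`word`, `pos_items`, `pronunciations`, `region`, `pronunciation`). The helper names `build_index` and `transcribe` and the sample records are hypothetical, made up for illustration, and are not part of the workflow.

```python
import re


def build_index(records, region):
    """Map each word to its first transcription for the given region."""
    index = {}
    for record in records:
        word = record.get('word')
        if word is None or word in index:
            continue
        for pos_item in record.get('pos_items', []):
            for pron in pos_item.get('pronunciations', []):
                if pron.get('region') == region and pron.get('pronunciation'):
                    index[word] = pron['pronunciation']
                    break
            if word in index:
                break
    return index


def transcribe(text, index):
    """Transcribe a sentence; words missing from the index get a trailing '*'."""
    words = re.sub(r'[,.?!;]', ' ', text).lower().split()
    return ' '.join(index.get(w, w + '*') for w in words)


# Made-up records in the shape the script expects (not real CEPD data)
sample_records = [
    {'word': 'thank', 'pos_items': [{'pronunciations': [
        {'region': 'uk', 'pronunciation': '/θæŋk/'},
        {'region': 'us', 'pronunciation': '/θæŋk/'}]}]},
    {'word': 'you', 'pos_items': [{'pronunciations': [
        {'region': 'us', 'pronunciation': '/juː/'}]}]},
]

us_index = build_index(sample_records, 'us')
print(transcribe('Thank you, Alfred!', us_index))
```

With the real file, `build_index(load_json_database(path), 'us')` would replace the sample records. Whether the index pays off depends on how long the looked-up sentences are, since building it still reads every record once.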
--- a/1000-hours/sounds-of-american-english/8.3-phoneme-exercises.md
+++ b/1000-hours/sounds-of-american-english/8.3-phoneme-exercises.md
@@ -0,0 +1,127 @@
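+# 8.3. Phoneme exercises
+
+This is a Jupyter Notebook for building the association between phonetic symbols and their sounds.
+
+Each run picks a random word from the Cambridge English Pronouncing Dictionary, plays a human recording of it, and then asks you to fill in the blanked-out vowels or consonants.
+
+> [phonetics-fill-in-exercise.ipynb](https://1000h.org/public/jupyter-notebooks/phonetics-fill-in-exercise.ipynb)
+
+The output looks like this:
+
+![phoneme-exercises.png](/images/phoneme-exercises.png)
+
+The notebook code is as follows:
+
+```python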
+# %%
+%pip install python-vlc
+
+# %%
+import json
+import random
+import re
+
+import requests
+import vlc
+from IPython.display import Audio
+
+# Load the CEPD data (one JSON record per line) from a URL or a local file.
+
+def load_json_database(source):
+    records = []
+    
+    def parse_json_lines(lines):
+        for line in lines:
+            if line:
+                try:
+                    record = json.loads(line)
+                    records.append(record)
+                except json.JSONDecodeError as e:
+                    print(f"Error parsing JSON: {e}")
+
+    try:
+        if source.startswith('http://') or source.startswith('https://'):
+            # Handle as URL
+            response = requests.get(source)
+            response.raise_for_status()  # Raise an error for bad status codes
+            parse_json_lines(response.iter_lines(decode_unicode=True))
+        else:
+            # Handle as file
+            with open(source, 'r', encoding='utf-8') as file:
+                parse_json_lines(file)
+    except requests.exceptions.RequestException as e:
+        print(f"Error fetching data from URL: {e}")
+    except FileNotFoundError as e:
+        print(f"Error opening file: {e}")
+    except Exception as e:
+        print(f"An unexpected error occurred: {e}")
+    
+    return records
+
+url = "https://raw.githubusercontent.com/zelic91/camdict/main/cam_dict.refined.json"
+json_database = load_json_database(url)
+
+
+# %%
+def search_in_json_database(database, search_word, region):
+    for record in database:
+        # Check whether the word field matches
+        if record.get('word') == search_word:
+            # On a match, collect the pronunciation info for the requested region
+            pos_items = record.get('pos_items', [])
+            for pos_item in pos_items:
+                pronunciations = pos_item.get('pronunciations', [])
+                for pronunciation in pronunciations:
+                    if pronunciation.get('region') == region:
+                        # Found a pronunciation for the region; return its details
+                        return {
+                            'pronunciation': pronunciation.get('pronunciation'),
+                            'audio': pronunciation.get('audio')
+                        }
+    # No matching word, or no pronunciation for the region: return 'not exist'
+    return 'not exist'
+
+def replace_with_underscores(match):
+    return '_' * len(match.group(0))
+
+# %%
+# get a random word from the database
+
+vowel_phonetics = re.compile(r'ɑːr|ɔːr|aɪr|aʊr|ɑː|ɔː|ɝː|iː|uː|ɪr|ʊr|er|ɔɪ|aɪ|eɪ|aʊ|oʊ|ʌ|ɪ|i|ʊ|e|æ|ə|ɚ|ɒ')
+consonant_phonetics = re.compile(r'str|spr|skr|spl|sfr|skw|skl|tʃ|dʒ|t̬|tr|dr|ts|dz|br|pr|fr|ɡr|θr|ʃr|kr|bl|kl|ɡl|fl|pl|sl|sp|st|sk|sm|sn|sw|p|b|t|d|k|ɡ|f|v|θ|ð|s|z|ʃ|ʒ|r|h|l|j|w|ŋ|n|m')
+
+# Skip inflected-looking words (endings such as 'es', 'ed', 'ing') and words without a US entry
+skipped_endings = ('ed', 'ing', 'es', 'ts', 'ks', 'ds', 'ps', 'bs', 'gs', 'ls', 'rs', 'ms', 'ns', 'er', 'est')
+while True:
+    random_word = random.choice(json_database)
+    random_word_entry = random_word['word']
+    if random_word_entry.endswith(skipped_endings):
+        continue
+    # Get the pronunciation for region 'us'; the lookup returns 'not exist' when there is none
+    random_word_us = search_in_json_database(json_database, random_word_entry, 'us')
+    if random_word_us != 'not exist' and random_word_us['pronunciation']:
+        break
+
+print(random_word_entry)
+random_word_phonetics = random_word_us['pronunciation']
+# Get the audio URL of the word
+random_word_us_audio_url = random_word_us['audio']
+print(random_word_us_audio_url)
+
+blank_vowel_phonetics = re.sub(vowel_phonetics, replace_with_underscores, random_word_phonetics)
+blank_consonant_phonetics = re.sub(consonant_phonetics, replace_with_underscores, random_word_phonetics)
+
+# fill vowels in blanks
+print(f'Fill vowels in blanks: {blank_vowel_phonetics}')
+
+# fill consonants in blanks
+print(f'Fill in consonants in blanks: {blank_consonant_phonetics}')
+
+# play the audio
+player = vlc.MediaPlayer(random_word_us['audio'])
+player.play()
+
+# display the audio
+Audio(url=random_word_us_audio_url)
+```
+
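
The blanking step in 8.3 depends on how the alternatives in the pattern are ordered. Python's `re` tries alternatives from left to right and takes the first one that matches, so a longer symbol such as `ɑːr` is blanked as one unit only if it is listed before its prefix `ɑː`. Here is a small sketch of that behaviour. The helpers `build_pattern` and `blank_out` and the shortened vowel list are hypothetical, chosen for illustration, and are not the notebook's own code.

```python
import re


def build_pattern(symbols):
    """Compile an alternation with longer symbols first and no empty alternatives."""
    unique = {s for s in symbols if s}
    ordered = sorted(unique, key=lambda s: (-len(s), s))
    return re.compile('|'.join(re.escape(s) for s in ordered))


def blank_out(pattern, transcription):
    """Replace every matched symbol with underscores of the same length."""
    return pattern.sub(lambda m: '_' * len(m.group(0)), transcription)


vowels = ['ɑː', 'ɑːr', 'ə', 'ɪ', 'iː', 'i']  # shortened, illustrative list
naive = re.compile('|'.join(vowels))          # 'ɑː' is tried before 'ɑːr'
ordered = build_pattern(vowels)               # 'ɑːr' is tried before 'ɑː'

print(blank_out(naive, 'kɑːr'))    # k__r
print(blank_out(ordered, 'kɑːr'))  # k___
```

Sorting by length keeps the symbol list easy to edit, because new entries can be appended in any order without changing which alternative wins.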
--- a/1000-hours/sounds-of-american-english/8.4-daily-speech-exercises.md
+++ b/1000-hours/sounds-of-american-english/8.4-daily-speech-exercises.md
@@ -0,0 +1,10 @@
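+# 8.4. Generating audio for daily exercises
+
+This is a Jupyter Notebook; you need your own OpenAI API key. Specify a `user-prompt`, and it generates:
+
+> * a passage with its markdown file, plus mp3 files read by the alloy and nova voices
+> * two dialogues on the same topic, with their mp3 files
+
+Link to the archive: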
+
+> [8.4-daily-speech-exercises.zip](https://1000h.org/public/jupyter-notebooks/8.4-daily-speech-exercises.zip)