Background#
As a free Twitter user who tweets often, I regularly run into the tweet length limit when I want to post slightly longer content.
In that case the content has to be split manually, and to make the split tweets easier to follow, a prefix such as "(x/x)" is usually added to show each tweet's position in the sequence. If new content is added while editing, the split often has to be adjusted all over again. After running into this scenario more and more often, I decided to write a script that automates the process.
This article briefly describes the design and implementation of that script. The script is not difficult to implement; writing it up is mainly a way for me to reflect on and summarize the projects I have built. I also hope it helps people with similar needs.
Repo address: https://github.com/catwithtudou/x_tool
Functional Requirements#
One-sentence description of the function: given formatted text as input, automatically split it when it exceeds the length limit of a tweet, and add a progress prefix to each part.
A few points to note:
- Splitting must not alter the original formatting of the text, and each tweet should be filled as fully as possible.
- Chinese characters must be handled properly, because they take up more than one byte and count differently toward the limit; otherwise the output can end up garbled.
Key Design#
Length Limit of Tweets#
First, we need to know the current length limit of a tweet, since the character length calculation that follows depends on it:
- The limit is 280 characters, excluding links and images.
- A Chinese character counts as two characters; every other character (punctuation, spaces, line breaks, etc.) counts as one.
Character Length Calculation in Go#
A few preliminaries about Go:
- Strings are stored as UTF-8 encoded bytes by default, and each Chinese character takes up 3 bytes.
- A string can be converted to a slice of the rune type. A rune represents one Unicode code point, so each Chinese character has a length of one in that slice.
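As a quick illustration of the two points above, the following sketch prints the byte length and the rune count of a few strings. The helper name `byteAndRuneLen` and the sample strings are made up for this example and are not part of the script.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneLen is a hypothetical helper: it returns the UTF-8 byte length
// and the rune (code point) count of s.
func byteAndRuneLen(s string) (byteLen int, runeLen int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"tweet", "推特", "x推特"} {
		b, r := byteAndRuneLen(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
	// "tweet": 5 bytes, 5 runes
	// "推特": 6 bytes, 2 runes
	// "x推特": 7 bytes, 3 runes
}
```

The gap between the two numbers comes entirely from the multi-byte characters, which is what the length calculation relies on.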
Based on this, the length a string takes up in a tweet can be calculated with the following code:
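// Requires: import "unicode/utf8"
func calculateTwitterContentLen(content string) int {
	// utf8.RuneCountInString returns the number of runes (code points) in the string.
	runeCount := utf8.RuneCountInString(content)
	// For text made of ASCII and 3-byte Chinese characters, (bytes - runes) / 2 is the
	// number of Chinese characters, each of which counts one extra time.
	return runeCount + (len(content)-runeCount)/2
}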
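To sanity-check the formula, the sketch below repeats the function in a standalone program and prints the result for a few inputs. The sample strings are illustrative only.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// calculateTwitterContentLen repeats the formula from the article so that
// this example compiles on its own.
func calculateTwitterContentLen(content string) int {
	runeCount := utf8.RuneCountInString(content)
	return runeCount + (len(content)-runeCount)/2
}

func main() {
	fmt.Println(calculateTwitterContentLen("hello"))  // 5: ASCII only
	fmt.Println(calculateTwitterContentLen("你好"))     // 4: two Chinese characters, two each
	fmt.Println(calculateTwitterContentLen("hi, 世界")) // 8: four ASCII characters plus two Chinese
	// Two 2-byte characters (U+00E9): 2 runes, 4 bytes, so the formula yields 3.
	fmt.Println(calculateTwitterContentLen("\u00e9\u00e9")) // 3
}
```

The last case shows a limitation worth keeping in mind: the formula assumes every non-ASCII character is 3 bytes long. Under the rule stated earlier, two accented Latin letters should count as 2, so text containing 2-byte characters can be slightly over-counted. For Chinese and ASCII text the result matches the rule.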
Binary Search for Character Truncation Position#
If the tweet content exceeds the length limit, we need to find the longest leading substring that still fits within the limit. Note that:
- "Length" here means the character length within a tweet, as calculated above.
- The length of the progress prefix (such as "(1/2)") has to be taken into account before searching for the truncation position.
- The cut must not fall in the middle of the bytes of a Chinese character, which would produce garbled text.
To speed up the search, binary search is used to locate the truncation position in the string. The code is as follows:
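// Requires: import "sort"; twitterMaxLength is defined elsewhere in the program.

// splitFirstMaxTwitterContent finds the first truncation position and returns
// the strings to its left and right.
func splitFirstMaxTwitterContent(content string) (left string, right string) {
	// Compare the tweet length, not the byte length, against the limit.
	if calculateTwitterContentLen(content) <= twitterMaxLength {
		return content, ""
	}

	// Search over runes rather than bytes so that a cut never lands inside a Chinese character.
	runeContent := []rune(content)
	// sort.Search returns the smallest i for which the first i+1 runes exceed the limit,
	// so the first i runes form the longest prefix that still fits.
	runeIdx := sort.Search(len(runeContent), func(i int) bool {
		_, isExceed := calculateTwitterRuneContentLen(runeContent[:i+1])
		return isExceed
	})
	return string(runeContent[:runeIdx]), string(runeContent[runeIdx:])
}

// calculateTwitterRuneContentLen returns the tweet length of runeContent and
// whether that length exceeds the limit.
func calculateTwitterRuneContentLen(runeContent []rune) (int, bool) {
	tweetLen := len(runeContent) + (len(string(runeContent))-len(runeContent))/2
	return tweetLen, tweetLen > twitterMaxLength
}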
Instructions for Use#
- Run the compiled main program.
- Enter the tweet content at the prompt in the terminal. Enter "exit" on a new line to end the input.
- If the tweet content does not exceed the length limit, the original text is returned. If it does, the program outputs the automatically split tweets, each with its progress prefix.
./main
===================================================
Please enter the content of the tweet (finally enter "exit" to end the input):
This will be a test text: The following sentences are excerpted from "Zhengjian".
Subconsciously, we expect ourselves to reach a state where we no longer need to repair anything. One day, we will "live a happy life from now on". We firmly believe in the concept of "solving". It seems that everything we have experienced, our life up to this moment, is just a rehearsal. The grand performance has not yet begun. For most people, this endless process of handling, rearranging, and updating versions is the definition of life.
We often think this way: when we die, the world will still exist. The same sun will continue to illuminate the earth, and the same planet will continue to rotate, because we think that since the beginning of time, they have always been like this. Our children will inherit this earth. All of this shows how ignorant we are of the constantly changing world and all phenomena.
exit
===================================================
Current Tweet Length: 566
Exceed the Twitter length limit: true
===================================================
Now back to the cut tweet content
>>>>>>>「the 1th tweet content」
(1/3)This will be a test text: The following sentences are excerpted from "Zhengjian".
Subconsciously, we expect ourselves to reach a state where we no longer need to repair anything. One day, we will "live a happy life from now on". We firmly believe in the concept of "solving". It seems that everything we have experienced, our life up to this moment, is just a rehearsal. The grand performance has not yet begun. For most people, this endless process of handling
>>>>>>>「the 2th tweet content」
(2/3)and rearranging, and updating versions is the definition of life.
We often think this way: when we die, the world will still exist. The same sun will continue to illuminate the earth, and the same planet will continue to rotate, because we think that since the beginning of time, they have always been like this. Our children will inherit this earth. All of this shows how ignorant we are of the constantly changing world and all
>>>>>>>「the 3th tweet content」
(3/3)phenomena.
>>>>>>>「End of tweet cutting」
Future Extensions#
- Terminal interaction optimization
- Automatic tweet publishing
- Formatting optimization
- ....