This chapter covers the common mistakes that new doctoral students make when presenting research papers and offers practical advice on how to avoid them. It emphasizes the importance of writing from the reader's perspective, providing clear motivation, and maintaining consistency in terminology and numbers. The chapter also highlights the importance of telling a convincing research story and of acting as a good salesman for your research ideas. In addition, it discusses the pitfalls of proposing a solution without a clear problem, the importance of drawing professional figures, and the need to choose notations carefully. The chapter closes with insights into how differences from related work can be summarized and identified effectively and why it is important to provide clear motivation for proposed methods. By addressing these key areas, the chapter aims to help researchers improve the clarity, impact, and professionalism of their research papers.
AI-generated
This summary of the content was generated with the help of AI.
Abstract
In this chapter, we discuss some common mistakes in presenting research papers that are made by new postgraduate students (including us in the early stages of our careers). We categorize these mistakes into the following thirteen types.
5.1 Never Write as a Reader
Many students write papers only from their own point of view, implicitly assuming that readers can follow whatever they are writing or thinking (see Fig. 5.1). However, we need to emphasize that readers are not all-knowing, especially in a fast-developing field like computer science. As an example, everyone talked about IoT/blockchain from 2019 to 2020, computational epidemiology from 2020 to 2022 (during the COVID-19 period), LLMs from 2023 to now, and AI for science from 2024 to now. It is impossible for every reader to master all of this knowledge in such a short period of time. Instead, many researchers can only stay within their own fields. Using the first author of this book as an example, he is dedicated to developing efficient algorithms for GIS operations and knows nothing about other fields. Moreover, even within GIS, he cannot know everything, i.e., he only understands some specific parts, e.g., kernel density visualization and the K-function. Under this circumstance, students should assume that readers have only minimal background knowledge (e.g., assume that they have just obtained a bachelor's degree in computer science). When students write each paragraph (or even each sentence) of a research paper, they need to keep asking whether readers can understand what they are writing (see Fig. 5.1).
Fig. 5.1
Instead of simply writing what you think, you need to write a paper from the point of view of a reader
Normally, there are several reasons why readers cannot understand a paper, which are summarized as follows.
The motivation is unclear. Some (junior) students write a research paper for solving a research problem without having any motivation for why they need to solve it. Consider the following example.
“Kernel Density Visualization (KDV) [1] has many applications, including crime hotspot detection [2], traffic accident hotspot detection [3], and disease outbreak detection [4]. However, this operation is very slow. Therefore, we need to propose efficient methods for solving this problem.”
After the student writes this paragraph for the introduction, he/she does not know what to write next and sends it back to the supervisor. The supervisor can then ask many questions. First, what is Kernel Density Visualization (KDV)? The student should not assume that everyone knows the concept of KDV. Instead, he/she should illustrate this concept in an intuitive way (e.g., by drawing a figure like Fig. 4.1). Second, who are the users? The student should state this clearly, since readers may not know who uses this tool. Third, how can KDV handle crime hotspot detection, traffic accident hotspot detection, and disease outbreak detection tasks? The student should explain how KDV handles these tasks in words that laypeople can easily understand (instead of simply listing all these tasks). Fourth, is it necessary to support KDV? The student should realize that other tools (e.g., histograms) can also be used for hotspot analysis, so he/she should motivate the importance of KDV, for example by noting that many software packages also need to handle KDV. Fifth, how slow is the operation? Readers may have no idea how slow KDV is. Therefore, the student should provide further evidence and quantitative values so that readers can understand why it is important to solve this problem. After asking the above questions about the motivation, we can revise this paragraph as follows.
“Kernel Density Visualization (KDV) [1] has been extensively used in different domains, including criminology, transportation science, and epidemiology. Criminologists and transportation scientists [2, 3] adopt KDV to discover crime and traffic/traffic accident hotspots in different regions. Epidemiologists [4] adopt KDV to detect disease outbreaks in various geographical regions. Figure 5.2 shows how KDV can be utilized to generate a hotspot map using the Los Angeles crime dataset (i.e., those yellow points). Observe that the hotspot region is colored in red, which indicates that this is a dangerous zone. Since many domain experts have adopted KDV for performing visual analysis, many software packages, e.g., QGIS [5], ArcGIS [6], Scikit-learn [7], Scipy [8], and Seaborn [9], already support this tool.
Fig. 5.2
A hotspot map (based on KDV) for crime events in Los Angeles, where the red region is a hotspot. (Obtained from Fig. 1 in “Tsz Nam Chan, Pak Lon Ip, Bojian Zhu, Leong Hou U, Dingming Wu, Jianliang Xu, Christian S. Jensen. Large-scale Spatiotemporal Kernel Density Visualization. ICDE 2025”)
However, KDV is a computationally expensive operation, which takes O(XYn) time, where \(X\times Y\) and n denote the resolution size and the number of data points (see those yellow points in Fig. 5.2a), respectively. Consider this Los Angeles crime dataset (with 1.26 million data points) as an example. Generating a \(1280\times 960\)-resolution KDV takes 1.548 trillion operations. Therefore, KDV does not scale to support large datasets and high-resolution sizes. To address the efficiency issue of KDV, we need to develop efficient algorithms.”
By comparing these two versions, we note that the second version makes it much clearer to readers why we need to solve this problem. Students need to keep asking the above types of questions so that they can write papers with clear motivation.
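The quantitative claim in the revised paragraph can be checked with a few lines of code. The following is a minimal sketch, assuming that a naive KDV performs one kernel evaluation per (pixel, data point) pair. The function names, the Gaussian kernel, and the bandwidth are our own illustrative assumptions and are not taken from the cited papers.

```python
import math


def naive_kdv_operations(width: int, height: int, num_points: int) -> int:
    """Kernel evaluations of a naive O(XYn) KDV: one per (pixel, data point) pair."""
    return width * height * num_points


def naive_kdv(points, width, height, bandwidth=1.0):
    """Naive KDV on a small grid (Gaussian kernel assumed for illustration).

    Returns the density grid and the number of kernel evaluations performed.
    """
    evaluations = 0
    grid = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for px, py in points:  # every pixel scans every data point
                dist2 = (x - px) ** 2 + (y - py) ** 2
                total += math.exp(-dist2 / (2 * bandwidth ** 2))
                evaluations += 1
            grid[y][x] = total
    return grid, evaluations


# Tiny sanity check: 4 x 3 pixels and 3 points give 36 evaluations.
_, count = naive_kdv([(0, 0), (1, 1), (2, 2)], width=4, height=3)
print(count)

# The setting quoted in the text: 1280 x 960 pixels, 1.26 million points.
ops = naive_kdv_operations(1280, 960, 1_260_000)
print(f"{ops:,} kernel evaluations (about {ops / 1e12:.3f} trillion)")
```

Stating such a number in the introduction gives readers a concrete sense of how slow the operation is, which is exactly what the fifth question above asks for.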
The connection between paragraphs is broken. Some junior students write paragraphs that are not connected with each other. They may discuss A in the first paragraph and then B in the second paragraph, even though A and B are not directly related and nothing links these two paragraphs. Here, we consider the following example.
“Kernel Density Visualization (KDV) [1] has been extensively used in different domains, including criminology, transportation science, and epidemiology. Criminologists and transportation scientists [2, 3] adopt KDV to discover crime and traffic/traffic accident hotspots in different regions. Epidemiologists [4] adopt KDV to detect disease outbreaks in various geographical regions.
In this paper, we would like to improve the efficiency of KDV by proposing the method XXX, which can significantly improve the efficiency by XXX compared with the existing methods.”
Fig. 5.3
This is how the first author of this book draws the map in a paper to illustrate why we need to propose another efficient solution for solving the KDV problem
The first paragraph above discusses the applications and the users of KDV. However, the second paragraph suddenly switches to the new method that improves the efficiency over the existing methods. There is no connection in between (i.e., we cannot understand how the first paragraph leads to the second paragraph). Here, we need to emphasize that this mistake is very common among junior students (i.e., inexperienced writers), including us when we were young, because their minds can easily be in chaos (or go blank) when they write their first few papers. We suggest that students take several blank sheets of paper and draw a map that connects the things they want to discuss. Such a map (see Fig. 5.3) can help students write papers in a more logical way.
Directly discuss the details without providing an intuitive illustration. When some students describe their new solution in a section, they simply discuss the details step by step. As an example, suppose that a student develops an algorithm with 20 lines. He/she simply discusses each line of the algorithm, similar to the following text.
“In line 1, our method scans all those data points and augment some additional terms on them. The time complexity of this line is O(n). In line 2 to line 5, ...”
Here, we need to point out that a research paper is not a machine manual. Readers can feel extremely bored and may skip all these details (and, of course, not understand them) when they read something like the above text. Therefore, the authors should explain the concepts intuitively. Consider our recent (top-tier) research paper “Yue Zhong, Tsz Nam Chan, Leong Hou U, Dingming Wu, Wei Tu, Ruisheng Wang, Joshua Zhexue Huang. A Fast and Accurate Block Compression Solution for Spatiotemporal Kernel Density Visualization. SIGKDD 2025.” as an example. In this paper, we propose the block compression method for reducing the dataset size, which improves the efficiency of computing approximate spatiotemporal kernel density visualization, i.e., STKDV (a variant of KDV). Although we also provide the algorithm (see Algorithm 1 in that paper), we use Fig. 5.4 to explain the concept. With the figure and its caption, readers can easily understand how the block compression method works without knowing the details of the algorithm.
Fig. 5.4
The SIGKDD 2025 paper utilizes this figure to illustrate the concept of block compression. (Modified from Fig. 6 in “Yue Zhong, Tsz Nam Chan, Leong Hou U, Dingming Wu, Wei Tu, Ruisheng Wang, Joshua Zhexue Huang. A Fast and Accurate Block Compression Solution for Spatiotemporal Kernel Density Visualization. SIGKDD 2025”)
The paper is not self-contained. A proposed solution may depend on some concepts from existing research papers. Some students therefore assume that reviewers have the responsibility (or obligation) to understand those papers in the literature. Otherwise, these reviewers should not have the right to review their papers (see Fig. 5.5). However, this mindset is completely wrong. As pointed out in Fig. 2.12, the knowledge space of a reviewer is very small. For example, most of the papers that we review (more than 99% of them) are not from our research areas. Therefore, it is not realistic to assume that reviewers understand those concepts from existing research papers. Instead, an author should make sure that his/her paper is self-contained, which means that every dependency is clearly explained in the paper. Using the above SIGKDD 2025 research paper as an example, that paper proposes the block compression solution (see Fig. 5.4), which can be combined with the state-of-the-art STKDV algorithms, SWS (published in VLDB 2022) and PREFIX (published in ICDE 2025). When we wrote this paper, we did not assume that readers knew these two papers, and we provided intuitive examples to illustrate the core ideas of these two methods. As a result, reviewers found our paper well written and could understand how it advances the state-of-the-art solutions (PREFIX and SWS), even though they were not very familiar with our research topic (the confidence scores were 1, 1, 1, 2, and 2 on a scale from 1 (the lowest confidence) to 5 (the highest confidence)). Hence, we need to emphasize that it is the responsibility (or obligation) of a student to make a paper self-contained so that reviewers can accept it (even though these reviewers may not be familiar with the research topic).
Fig. 5.6
A productive student must be able to tell a good story for motivating a research problem
Many (junior) students may think that computer science researchers only need to develop solutions for computational problems and that telling a story is not their job. Some of them may even regard telling a story as a way of avoiding honest work. Instead, they believe that they should develop a good solution that can “speak” for itself (or can “win” against other solutions). Here, we need to emphasize that this mindset is wrong. Note that the goal of publishing a research paper is not to demonstrate that the authors are smart (or smarter than others). Instead, the goal is to show that they have solved some important and practical problems raised by society, by doing something (e.g., developing a new algorithm or building a new system) that advances the state of the art for the benefit of users. Therefore, there must be a story behind each research paper, and this story can be much more important than the proposed techniques (because the story determines whether a research problem is worth studying or not). In other words, if a student is able to tell a good story (like Fig. 5.6), it should be easy for this student to discover many future research problems (and thus, publish many research papers).
5.3 Never Act as a Salesman for Writing Papers
Many (junior) students may think that writing papers is a serious matter. Therefore, they think that they should not include any exaggerated claims or fake/falsified results in research papers. Here, we need to emphasize that we totally agree with this. Note that providing fake/falsified results (an ethical issue) can even end a student's career. However, those students should not conclude that they cannot sell a research paper. In fact, selling a research work is an important skill for students. Consider selling apples as an example (see Fig. 5.7). We know that not every apple is beautiful. Therefore, a salesman often has to sell apples that are not perfect. A bad salesman only reports the weakness of the apple (i.e., rotten). However, a good salesman can find something good about his/her apple (e.g., pesticide-free, which suggests that the apple is healthy).
Based on the above analogy, students need to realize that not every solution is perfect (i.e., every solution is likely to have its weaknesses). For example, some solutions may suffer from relatively large space consumption, while others may suffer from relatively high response time. However, reviewers (or buyers) only accept papers (or buy things) that clearly show their advantages. Therefore, the authors of a research paper should act as good salesmen for their research ideas by highlighting the strengths and mitigating the weaknesses (see Fig. 5.8).
5.4 Write a Solution Before Having a Clear Problem
Some (junior) students may focus on writing (or thinking of) the solution. For example, they may want to develop a fast indexing method (e.g., a new tree structure) or a new machine learning model for improving the efficiency or the accuracy. However, when we ask them what research problems they are solving, they cannot give us a crystal clear answer (see Fig. 5.9). For those students, we would like to emphasize that we need to have a research problem first before we can have a solution for it. The order cannot be reversed, because without a problem there is no reason (or no motivation) to develop the solution. Therefore, students must clearly write down the problem statement (or the formal problem definition) before they write the methodology part. Otherwise, the methodology part should be deemed useless.
Fig. 5.9
Some students conduct research for finding a new solution but they are still unclear about the research problem
Note that some students may have the mindset that a solution is more important than a problem. Their main reason is that a solution can be very “fancy”, e.g., full of mathematical proofs, mathematical derivations, and complicated algorithms, which can somehow show their “intelligence”. Unfortunately, we would like to point out that this mindset is completely wrong. A new research problem can be even more important than a new solution. As reviewers for top-tier conferences/journals of computer science, we would like to point out that a new research problem can be regarded as a new research direction, which can possibly attract many researchers in the future. To our understanding (see Fig. 5.10), reviewers are normally more inclined to accept research papers with new research problems than papers with new solutions (especially for conference papers). Here, we further consider the first author of this book as an example. He has written two papers, which are (1) “LARGE: A Length-Aggregation-based Grid Structure for Line Density Visualization” and (2) “Large-scale Spatiotemporal Kernel Density Visualization”. The technique in paper (1) mainly consists of some lower and upper bound functions combined with the grid structure, which are not very novel (since many database researchers have used these two techniques before). However, since the research problem of “line density visualization” appeared in the database community for the first time, this paper was accepted at VLDB 2025 without any prior rejection. In contrast, paper (2) has a solid technical contribution: it proposes a new data structure that theoretically reduces the time complexity of generating an exact spatiotemporal kernel density visualization without theoretically increasing the space complexity. However, since the research problem of “spatiotemporal kernel density visualization” is not new, this paper was deemed not novel by reviewers and was rejected five times before its acceptance at ICDE 2025. Therefore, based on the above discussion, we strongly suggest that students think of a novel research problem.
5.5 Never Actively Draw Figures
Many students (especially junior students) may not actively draw figures when they write research papers. Normally, they draw figures only after their supervisors tell them to do so. The main reasons are that (1) drawing a beautiful figure can take a long time compared with typing some words to explain the concept and (2) they may not have enough experience in drawing figures. For these reasons, they avoid drawing figures in a paper. However, drawing figures is indeed the most important step in writing a good paper, for the following two reasons.
Fig. 5.11
This figure (copied from the New York Times news) is very intuitive, which shows a lot of information without writing even a single word
A figure is worth a thousand words. Figure 5.11 shows a G7 meeting in 2017 regarding political/economic issues between different countries. By solely looking at this figure (without reading even a single word), we can immediately obtain a lot of information. As an example, we can tell that Angela Merkel (the chancellor of Germany at that time) must have had a serious conflict with Donald Trump (the president of the US at that time) during the meeting. As another example, we can also tell that this meeting was probably not very successful. Since humans absorb information from images more easily than from text (i.e., figures are more intuitive than text), students should actively draw figures so that other people can easily understand their papers.
A figure can easily guide a student to write words. When a student wants to explain a complicated concept without drawing a figure, he/she can discover that it is very difficult to write even a single sentence. The main reason is that some concepts are hard to explain solely by text. However, once he/she has drawn a figure, he/she only needs to write words by referring to that figure, which makes his/her life much easier. Consider Support Vector Machine (SVM), which is a commonly used machine learning model, as an example (see Fig. 5.12). The student can easily explain the concept of maximum margin based on the three straight lines (see the black dashed lines) in the figure. However, without this figure, SVM is very difficult to explain in an intuitive way.
Fig. 5.12
Writing the concepts about SVM by students (with and without a figure)
Many (junior) students may not draw figures seriously when they are writing papers. Some of them may think that they are computer science researchers, not artists. Therefore, they ask these questions. Why do they need to draw figures seriously? Why not just draw figures arbitrarily? Here, we would like to emphasize that figures can be regarded as the soul of each paper. The main reason is that figures are very intuitive. As such, most readers look carefully at all figures in the paper. Therefore, if those figures are not drawn seriously, readers (or reviewers) cannot grasp much useful information from that paper and can even form a bad impression of the authors (if the figures are really bad). Here, we would like to point out some common mistakes that are made by students.
Fig. 5.13
Figure 1 was drawn by the first author of this book when he wrote his first paper for SSTD 2015. He was severely criticized by his supervisor at that time
Do not accurately control the font size of the words in a figure. After a (junior) student draws a figure and puts it into the PDF, he/she may not care about the font size of the words in the figure, which can be either too big or too small. From the viewpoint of reviewers, this makes the paper look very strange (or unprofessional). Using Fig. 5.13 as an example, the font size of the numerical values in the “query matrix q” is too big compared with the font size of the main text. Using Fig. 5.14 as another example, reviewers cannot read this figure clearly since the font size of the x-axis labels, the y-axis labels, and the name of each method is very small. Hence, a rule of thumb for drawing a figure is that the font size of the words in the figure should be similar to or slightly smaller than the font size of the text in the paper.
Fig. 5.14
Figure 11 was drawn by the first author of this book when he wrote his first paper for SSTD 2015. He was severely criticized by his supervisor at that time
Do not convey enough information. Consider Fig. 5.13 as an example. Figure 1 originally aims to illustrate the template matching problem. However, readers can only see these two big matrices and cannot understand this problem by reading this figure (i.e., only obtain a small amount of information). Therefore, this figure is not useful and must be redrawn.
Put too many details in a figure. Consider the first author of this book as an example. His supervisor asked him to draw some figures to illustrate his algorithm. In response to this request, he drew Fig. 6 in Fig. 5.15. However, this figure is very complicated and contains many details, including mathematical inequalities, different branches, and complex terms (e.g., group entry, individual entry, and refinement). For this reason, readers cannot understand this figure in a short period of time. However, a figure is supposed to be easily understood, i.e., everyone can know what it conveys immediately. Therefore, if a figure cannot achieve this goal, it is not a successful figure and must be redrawn.
Fig. 5.15
Figure 6 was drawn by the first author of this book when he wrote his first paper for SSTD 2015. He was severely criticized by his supervisor at that time
Draw unprofessionally. To be honest, this is very hard to define, but it can be easily identified by many professional (experienced) writers. Using Fig. 5.15 as an example, this figure is very unprofessional. First, some arrows point to other components at different angles. A clean figure should not have such slanted arrows (the lines should be either vertical or horizontal). Second, some words (e.g., “e” and “smallest LB” in “Pick an entry e with the smallest LB”) stick to the boundary of the box. Third, the first letter of each word in “Group Entry” and “Individual Entry” is capitalized while the other components are not. Students are encouraged to (1) learn how to draw professional figures by reading research papers from top-tier conferences/journals and (2) practice by drawing more figures. With more experience, they will eventually be able to draw professional figures.
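The font-size rule of thumb mentioned in the first mistake above can be checked before a figure is placed into the paper. The sketch below assumes that the figure is scaled uniformly when it is inserted, so the font size on the page shrinks or grows by the same factor as the figure width. The function names, the 10 pt body text, and the two tolerance ratios are illustrative assumptions, not values given in this chapter.

```python
def effective_font_size(font_pt: float, drawn_width_in: float, placed_width_in: float) -> float:
    """Font size (in pt) as it appears on the page after uniform scaling of the figure."""
    return font_pt * placed_width_in / drawn_width_in


def check_figure_font(font_pt, drawn_width_in, placed_width_in,
                      body_pt=10.0, min_ratio=0.8, max_ratio=1.05):
    """Classify the on-page font size relative to the body text.

    body_pt, min_ratio, and max_ratio are assumed thresholds that encode
    "similar to or slightly smaller than the text in the paper".
    """
    ratio = effective_font_size(font_pt, drawn_width_in, placed_width_in) / body_pt
    if ratio > max_ratio:
        return "too big"
    if ratio < min_ratio:
        return "too small"
    return "ok"


# A figure drawn 8 inches wide with 12 pt labels, placed at a width of 4 inches,
# ends up with 6 pt labels on the page.
print(effective_font_size(12, 8.0, 4.0))
print(check_figure_font(12, 8.0, 4.0))
```

A simple way to avoid the problem altogether is to draw each figure at the width at which it will be placed, so that no scaling happens.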
5.7 Never Carefully Choose Notations
Many students may not carefully choose notations when they are writing research papers. Note that carelessly chosen notations can significantly affect the readability of a research paper, and reviewers can easily detect them. Using the first author of this book as an example, he has rejected various research papers in the past because of messy notations (which made those research papers difficult to understand). There are several types of mistakes for those notations, which can be categorized as follows.
Use a long English word to represent a notation. Some students may not think deeply about the notation for representing each term. Instead, they may simply use the name of that term to be the notation. Using the formula of computing the speed of an object as an example, they can write this formula as follows.
\(speed = \frac{distance}{time}\)
Worse still, some terms may have multiple words. As an example, suppose that they want to find the density of objects inside a region A. They can write the equation in this way.
Once we see these equations/notations, we immediately form a bad impression of those authors. The main reason is that these symbols are not professional, and those students did not spend the time (probably at most one to two hours) needed to address this obvious issue. In a recent TKDE submission that was reviewed by the first author of this book, he decided to reject the paper after he saw these unprofessional symbols (of course, that paper also suffered from other issues).
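As a hedged illustration of the contrast, the sketch below writes a speed formula and a density formula first with word-style notations and then with compact symbols. The original equations are not reproduced here. Every name and symbol below (\(v\), \(d\), \(t\), \(\rho_A\), \(N_A\), \(|A|\), and the word-style names) is our own assumption, chosen only for illustration.

```latex
% Word-style notations (assumed wording): long, hard to read, unprofessional
\[
  speed = \frac{distance}{time}, \qquad
  density\_of\_objects = \frac{number\_of\_objects\_in\_A}{area\_of\_A}
\]

% Compact notations (assumed symbols): v = speed, d = distance, t = time,
% N_A = number of objects inside region A, |A| = area of A, \rho_A = density
\[
  v = \frac{d}{t}, \qquad
  \rho_A = \frac{N_A}{|A|}
\]
```

Each compact symbol should be defined once in the text (or in a notation table) before it is first used.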
Use different notations for the same term. Some students can be in chaos when they are writing research papers (perhaps because these are their first few papers). They can forget the notation that they have already defined for a term. For this reason, they can use another notation, which is close to the meaning of that term, later in the paper. As an example, for the term “temporal distance (a.k.a. time range/temporal threshold)”, it is possible for them to use the notations t, d, r, and \(\tau \) “interchangeably”. Readers can easily spot these unnecessary notations, which leads to a bad impression (e.g., readers can think that these students are careless and did not proofread the paper). To ensure consistency, these students should frequently check the notations that they have already defined whenever they need to write equations or proofs.
Reuse the same notation for different terms. When some students write research papers, they do not keep track of which notations they have already defined. Once they encounter a new term, they directly define the notation that is closest to that term. As an example, when they need notations for the two terms time and temporal distance, they can simply choose the notation t to represent both of them. This can easily happen when there are many notations in a paper (e.g., a theoretical paper). Note that readers can misunderstand the paper when the same notation has different meanings. To avoid this issue, whenever students need to define the notation of a term, we suggest that they (1) stop for a while, (2) check the notations that were already defined in the paper, and (3) decide on a suitable notation for that term. With this approach, students are likely to avoid reusing the same notation for different terms in a paper.
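The two notation mistakes above (several notations for one term, and one notation for several terms) can be caught mechanically if students keep a list of (notation, term) pairs while writing. The following is a minimal sketch of such a check. The function name and the example list are hypothetical and only mirror the examples in the text.

```python
from collections import defaultdict


def check_notations(definitions):
    """Check a list of (notation, term) pairs for the two mistakes above.

    Returns two dictionaries:
      reused:     notation -> terms, for notations that denote more than one term
      duplicated: term -> notations, for terms that have more than one notation
    """
    terms_by_notation = defaultdict(set)
    notations_by_term = defaultdict(set)
    for notation, term in definitions:
        terms_by_notation[notation].add(term)
        notations_by_term[term].add(notation)
    reused = {n: sorted(ts) for n, ts in terms_by_notation.items() if len(ts) > 1}
    duplicated = {t: sorted(ns) for t, ns in notations_by_term.items() if len(ns) > 1}
    return reused, duplicated


# Hypothetical notation list that reproduces both mistakes from the text.
definitions = [
    ("t", "time"),
    ("t", "temporal distance"),
    ("d", "temporal distance"),
    ("r", "temporal distance"),
    ("tau", "temporal distance"),
    ("n", "number of data points"),
]

reused, duplicated = check_notations(definitions)
print("Same notation, different terms:", reused)
print("Same term, different notations:", duplicated)
```

Running such a check before each submission takes a few seconds and removes a class of mistakes that reviewers notice immediately.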
5.8 Never Care About the Consistency Issues for Writing Papers
Many students do not care about consistency issues when writing papers. However, consistency issues in a research paper can be easily detected by reviewers of conferences/journals. Once these issues are detected, reviewers (including us) can form a bad impression of that paper, because these issues could easily have been addressed or avoided before submission. Here, we list two common consistency issues in computer science papers.
Use different names for the same term. When some students write different sections (or even different paragraphs of the same section) of a paper, they may use different names for the same term. Using graph data management as an example, both “node” and “vertex” have the same meaning. Worse still, some terms can have many names. For example, “template matching”, “pattern matching”, “nearest neighbor search on matrix”, and “subwindow search” refer to the same term in the image processing and pattern recognition communities. If students use all these names interchangeably in the same paper, reviewers, especially those who are not working in the same area, will consider the paper chaotic and can find reasons to reject it. Based on the above discussion, those students need to make sure that they use the same name for the same term throughout the same paper (see Fig. 5.16).
Fig. 5.16
A careful student makes sure that he/she uses the same name for the same term throughout the same paper, while a careless student does not care about it
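The naming issue described above can be checked with a small script before submission. The following is a minimal sketch, assuming that the student maintains a list of synonym groups for the terms of the paper. The groups, the function name, and the sample draft are hypothetical. The matching only handles the exact phrase and a trailing “s”, so irregular plurals such as “vertices” are not detected.

```python
import re

# Hypothetical synonym groups, taken from the examples in the text.
SYNONYM_GROUPS = [
    ["node", "vertex"],
    ["template matching", "pattern matching", "subwindow search"],
]


def find_mixed_terms(text, synonym_groups):
    """Return, for each group, the synonyms found in the text if more than one is used."""
    lowered = text.lower()
    mixed = []
    for group in synonym_groups:
        used = [
            term for term in group
            # Whole-phrase match with an optional plural "s".
            if re.search(r"\b" + re.escape(term) + r"s?\b", lowered)
        ]
        if len(used) > 1:
            mixed.append(used)
    return mixed


draft = (
    "Each node stores a label. We then visit every vertex once. "
    "Template matching is applied to each image."
)
print(find_mixed_terms(draft, SYNONYM_GROUPS))
```

For each reported group, the student keeps one name and replaces the others throughout the paper.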
The section titles are not consistent. Some students do not ensure that their section titles follow a consistent style. Consider the following example of section titles.
3. Preliminaries
3.1. problem definition
3.2. State-of-the-art Solution
3.3. Indexing Framework: An augmented R-tree structure
Observe that those students do not use the same style for these subtitles. “problem definition” is entirely in lower case. The first letter of each word in “State-of-the-art Solution” is capitalized. The first letters of the words in “Indexing Framework: An augmented R-tree structure” are a mix of lower case and upper case. Note that some students may think that these are just small mistakes. Why do we need to care so much about them? Here, we need to emphasize that this mindset is incorrect. Reviewers can easily spot this issue because titles (and subtitles) are normally bold and set in a large font size. With these careless mistakes, reviewers can think that the authors were not very careful in writing that paper. Worse still, reviewers can further conclude that the authors were not serious about that paper. If the authors do not care much about their paper, why should reviewers care about it? As such, they can simply find reasons to reject it.
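The capitalization check described above is easy to automate. The sketch below classifies each title as title-case, lower-case, or mixed and reports whether all titles share one style. The classifier is a simplification under stated assumptions: it splits titles on whitespace, ignores numbering tokens such as “3.1.”, and skips a small, assumed list of minor words (except in first position). It is not a full implementation of any style guide.

```python
# Assumed list of minor words that title case usually leaves in lower case.
MINOR_WORDS = {"a", "an", "the", "of", "for", "and", "in", "on", "to", "with"}


def title_style(title: str) -> str:
    """Classify the capitalization style of one section title."""
    # Keep only tokens that start with a letter (drops numbering such as "3.1.").
    words = [w for w in title.split() if w[0].isalpha()]
    # Ignore minor words unless they are the first word of the title.
    major = [w for i, w in enumerate(words) if i == 0 or w.lower() not in MINOR_WORDS]
    capitalized = [w[0].isupper() for w in major]
    if all(capitalized):
        return "title-case"
    if not any(capitalized):
        return "lower-case"
    return "mixed"


titles = [
    "3. Preliminaries",
    "3.1. problem definition",
    "3.2. State-of-the-art Solution",
    "3.3. Indexing Framework: An augmented R-tree structure",
]

styles = {title: title_style(title) for title in titles}
for title, style in styles.items():
    print(f"{style:10s} {title}")
print("consistent:", len(set(styles.values())) == 1)
```

On the example titles from the text, the script reports three different styles, which matches the observation above.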
5.9 Never Summarize and Identify Differences for Related Work
Many (junior) students simply list existing research studies in the related work section without identifying the differences between these studies and their proposed solutions (see Fig. 5.17). This is acceptable if the students are not yet familiar with the literature, i.e., they are still at the early (exploring) stage of handling a new research problem and have not proposed a new solution. However, if they are at the middle or final stage of handling the problem, i.e., they are familiar with the existing studies and have ideas for a new solution, they should not write the related work section in this way. The main reason is that reviewers need to determine whether a research paper advances the state of the art. If the related work section does not clearly show the differences between the proposed solution and existing methods, reviewers may conclude that yet another solution is unnecessary. For this reason alone, they can reject the paper.
Fig. 5.17
A bad related work section simply lists all those related studies
Instead, students should revise the related work section by (1) summarizing existing research papers into different groups and (2) pointing out the differences between the proposed solution and each of these groups (see Fig. 5.18). The main reason for grouping is that a field may contain many related papers. For example, there are many research papers related to the function approximation approach. Without these groups, reviewers can easily point to an uncited paper to criticize the submission (e.g., paper [z] in Fig. 5.17). With these groups, even if reviewers identify a missing paper [z], they may not reject the submission outright, provided that the group covering [z] has been discussed in the related work section (see Fig. 5.18).
Fig. 5.18
A good related work section needs to (1) categorize existing papers into different groups and (2) point out the differences between the proposed solution and each of these groups
5.10 Too Much Background Information in the Introduction
Writing the Introduction section is often challenging for students. A common mistake is a lack of focus: some students spend more than two paragraphs discussing general background information without clearly stating the research problem or motivations. This can leave reviewers confused, as they are forced to read several unfocused paragraphs without understanding the purpose of the work. If reviewers have to work hard to figure out what the manuscript is about, it is likely to be rejected. We have encountered several manuscripts exhibiting this issue. Due to privacy concerns, we present an artificial example below.
Artificial intelligence (AI) has rapidly evolved over the last decade, becoming a major force in shaping industries and scientific discovery. As one of the key components of AI, machine learning has found success in areas such as finance, healthcare, robotics, and education. From personalized recommendations to autonomous vehicles, machine learning is making its presence felt in our daily lives.
Among various machine learning techniques, neural networks have played a pivotal role in driving progress. Their ability to learn hierarchical features from data has made them particularly suitable for tasks involving images, text, and audio. Over the years, researchers have explored different neural network architectures, from shallow models to deep networks with hundreds of layers. Training algorithms, optimization techniques, and regularization methods have also seen significant improvements.
Backpropagation, gradient descent, batch normalization, and attention mechanisms are just a few of the key innovations that have contributed to the success of neural networks. With growing computational power and data availability, these models continue to evolve. In addition, transfer learning and pre-trained models such as BERT and GPT have further expanded the capabilities of neural networks in natural language processing.
In this paper, we explore \(\dots \)
In the poorly written example above, the first three paragraphs talk broadly about AI, ML, and neural networks without narrowing down to a specific topic. Nowhere does it state what problem is being addressed, why it matters, or what previous methods fail to achieve. Why is this research important? Why now? Reviewers will not find any compelling reason to keep reading. Below, we show a rewritten, focused, and well-structured version of the poor example above.
Text classification is a fundamental task in natural language processing (NLP) with applications in sentiment analysis, topic detection, and spam filtering. Recent advances in neural network models, particularly large pre-trained language models such as BERT and GPT, have significantly improved performance on standard benchmarks. However, these models typically require large amounts of labeled data and extensive computational resources for fine-tuning, which limits their applicability in real-world scenarios where data is scarce.
In particular, few-shot text classification, where only a handful of labeled examples are available for each class, remains a challenging problem. While meta-learning and prompt-based methods have shown promise, they often rely on task-specific tuning or suffer from unstable performance across different domains. Moreover, existing models are often over-parameterized for low-data regimes, resulting in inefficiency and poor generalization.
To address these limitations, we propose \(\dots \)
In this rewritten version, the introduction begins by presenting the broader task of text classification and recent advances. It then narrows the focus to few-shot text classification, explaining why this problem is important and challenging, and highlights the inefficiencies and instabilities of existing methods.
5.11 Solutions Do Not Address the Stated Challenges
When approaching a research problem, there are often multiple facets and challenges to consider. Some students attempt to highlight several of these issues in the Introduction section, aiming to demonstrate a comprehensive understanding of the problem space. However, a common mistake arises when the proposed methods or solutions presented later in the manuscript fail to address all the challenges previously outlined. This disconnect can confuse reviewers, who may expect the proposed work to provide solutions to all the identified problems. As a result, the manuscript may appear unfocused or incomplete, increasing the likelihood of rejection. We have observed this issue in several submitted manuscripts. To protect the confidentiality of the authors, we illustrate it with a constructed example below.
Modern recommendation systems face a variety of challenges that limit their performance and applicability in real-world scenarios. These include: (1) Data sparsity, where users interact with only a small fraction of available items; (2) Cold-start problems, especially for new users or new items with no historical data; (3) Dynamic user preferences, which evolve over time and require models to adapt accordingly; and (4) Bias and fairness concerns, where recommendations may amplify popularity bias or marginalize minority users or items. Addressing all these challenges is critical for building robust and trustworthy recommendation systems.
This introduction sets a comprehensive stage, implying that the proposed method tackles a broad set of fundamental and well-known issues in recommendation. However, in the Methodology section, the authors present a graph-based collaborative filtering model that enhances user-item interaction modeling by learning representations over a user-item bipartite graph using graph neural networks (GNNs). The model achieves improved accuracy on standard recommendation benchmarks, but it does not include any mechanisms for: cold-start handling, temporal modeling of user preference dynamics, or bias mitigation.
This creates a clear misalignment between the problem framing and the actual contribution. Reviewers may raise concerns such as: “The introduction outlines several important challenges, but the method only addresses user-item interaction modeling,” or “No empirical evidence is provided to support claims related to fairness, cold-start, or user preference drift.” Such confusion can lead reviewers to question the focus of this paper and interpret the Introduction as overstating the contribution.
5.12 Unclear Motivation for Proposed Methods
When developing and designing methods for a research problem, some students tend to experiment with a few plausible approaches without carefully examining the underlying motivations. As shown in Fig. 5.19, if several methods fail, they may simply consider themselves unlucky. Conversely, if one approach happens to yield good results—often due to favorable data characteristics or tuning—they may quickly settle on that solution and begin writing a manuscript. In many cases, the resulting paper ends up as a straightforward description of the model architecture, followed by implementation steps and experimental outcomes, with little explanation of why the method was designed the way it was.
However, what makes a research paper compelling is not merely the final model, but the thought process that led to it. Readers and reviewers are often most interested in the intellectual journey: Why did you design the method this way? What challenge were you trying to address? What alternatives did you consider and why were they rejected? What core insight guided your decisions? These questions speak to the rationale and motivation behind the method, which are crucial for both scientific clarity and reader engagement.
A manuscript that lacks a clear discussion of the motivation and rationale behind the proposed method often feels shallow or unconvincing. Worse still, without a solid understanding of why the method works, it is difficult to judge whether it will generalize to datasets beyond those used in the paper. This undermines both the scientific value of the work and the confidence of the readers or reviewers in its broader applicability.
Fig. 5.19
Proposed methods should have clear motivations
5.13 Propose a “New” Method by Naïvely Assembling Existing Techniques
Given a research problem, it is common for students to consider building a solution by assembling existing techniques. This strategy can be a valid way to propose a new method, particularly in areas where the reconfiguration or integration of established components can lead to improved performance, efficiency, or applicability. Indeed, many impactful works in machine learning and related fields are the result of carefully orchestrated combinations of prior methods. However, while assembling existing techniques can lead to novel insights, it also carries certain risks—especially when the assembly lacks innovation or depth.
As illustrated in Fig. 5.20, imagine two contrasting outcomes of the assembly approach. In the first case, the final method resembles a cake—an arrangement of well-known components that does not go beyond what is already well understood in the literature. The method may rely heavily on established modules, apply them in conventional ways, and introduce minimal or no new conceptual contributions. In such cases, reviewers may perceive the work as “a piece of cake”—that is, overly simplistic, obvious, or superficial. The lack of technical novelty or theoretical insight will likely lead to rejection, as the contribution does not meaningfully advance the field.
In the second case, the assembly results in something much more ambitious and integrated—like a rocket. Here, although the individual components may still be based on prior work, their combination is non-trivial. The integration requires careful engineering, deep understanding of each part, and possibly the development of new mechanisms to make them work together effectively. This type of work demonstrates creativity, technical depth, and a clear understanding of the problem space. Reviewers are more likely to recognize the novelty in how the components are combined, the challenges addressed during integration, and the new capabilities that emerge from the system as a whole. Such work is typically viewed as a meaningful contribution to the field.
Therefore, when proposing a new method by assembling existing techniques, it is essential to go beyond simply stacking modules together. One must articulate what is new in the combination, why the integration is necessary and non-trivial, and how it leads to a solution that could not have been achieved by naïvely applying the individual components in isolation. In essence, the difference between a cake and a rocket lies in the depth of thought, the degree of innovation, and the clarity of contribution that the assembly delivers.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.