Text and Response Module (TRM)
- TRM is a neural network module that fuses text with user engagement and temporal data to produce unified vector representations.
- It segments engagements into time windows, encodes features such as engagement count, elapsed time, user profiles, and doc2vec embeddings, and aggregates them via an LSTM.
- Empirical results within the CSI framework show that TRM achieves higher accuracy and parameter efficiency compared to baseline GRU/LSTM models in tasks like fake news detection.
A Text and Response Module (TRM) is a neural network component designed to encode textual information together with patterns of user engagement, producing a vector representation that unifies temporal, user, and semantic features. TRMs are used in settings where the temporal dynamics of, and social response to, content are critical signals for downstream prediction tasks such as fake news detection, as introduced within the CSI framework (Ruchansky et al., 2017). TRMs systematically fuse structured engagement metadata, user profile embeddings, and distributed representations of text to capture nuanced patterns in how information propagates and is responded to over time.
1. Architectural Composition
A TRM processes a sequence of engagements associated with a document or article. The workflow consists of segmenting user interactions into discrete time windows, encoding per-window features, and aggregating these with a recurrent neural network:
- Temporal Partitioning: Engagements (e.g., posts, tweets) linked to a target article are aggregated into nonempty time windows (e.g., hourly).
- Raw Feature Construction: In each time window $t$, a feature vector is assembled from:
  - $n_t$: the count of engagements during window $t$.
  - $\Delta t$: the elapsed hours since the preceding nonempty window.
  - $\bar{u}_t$: the window-average "user-profile" vector of the participating users (obtained via truncated SVD, dimension $20$).
  - $\bar{d}_t$: the window-average doc2vec embedding of the texts posted in the window.
- The composite feature is the concatenation $x_t = [n_t, \Delta t, \bar{u}_t, \bar{d}_t]$.
- Feature Embedding Layer: $x_t$ is projected through a learned affine layer followed by a nonlinearity, $\tilde{x}_t = \tanh(W x_t + b)$.
- Sequential Encoding: The embedded features $\tilde{x}_1, \dots, \tilde{x}_T$ are input to an LSTM with hidden/cell size $d$, capturing temporal dependencies and engagement patterns.
- Module Output: The final LSTM hidden state yields the article-level vector $v_a \in \mathbb{R}^d$.
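The per-window fusion, projection, and LSTM aggregation above can be sketched in NumPy. This is a minimal illustration, not the CSI reference implementation: all dimensions, initializations, and helper names are assumptions.

```python
# Minimal NumPy sketch of a TRM forward pass (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, params):
    """One standard LSTM step; params holds W (input), U (recurrent), b."""
    W, U, b = params
    z = W @ x + U @ h + b                      # stacked gate pre-activations
    d = h.shape[0]
    i = 1 / (1 + np.exp(-z[:d]))               # input gate
    f = 1 / (1 + np.exp(-z[d:2*d]))            # forget gate
    o = 1 / (1 + np.exp(-z[2*d:3*d]))          # output gate
    g = np.tanh(z[3*d:])                       # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def trm_forward(windows, proj, lstm, d_hidden):
    """windows: list of (count, elapsed_hours, user_vec, text_vec) per window."""
    Wp, bp = proj
    h = np.zeros(d_hidden)
    c = np.zeros(d_hidden)
    for n_t, dt_t, u_t, d_t in windows:
        x = np.concatenate(([n_t, dt_t], u_t, d_t))  # feature fusion
        x_emb = np.tanh(Wp @ x + bp)                 # affine + nonlinearity
        h, c = lstm_step(x_emb, h, c, lstm)
    return h                                         # article vector v_a

# Toy dimensions (assumed): 20-d user profiles, 100-d doc2vec, 50-d hidden state.
d_u, d_text, d_emb, d_h = 20, 100, 64, 50
d_in = 2 + d_u + d_text
proj = (0.1 * rng.standard_normal((d_emb, d_in)), np.zeros(d_emb))
lstm = (0.1 * rng.standard_normal((4*d_h, d_emb)),
        0.1 * rng.standard_normal((4*d_h, d_h)),
        np.zeros(4*d_h))

windows = [(5.0, 1.0, rng.standard_normal(d_u), rng.standard_normal(d_text)),
           (2.0, 3.0, rng.standard_normal(d_u), rng.standard_normal(d_text))]
v_a = trm_forward(windows, proj, lstm, d_h)
print(v_a.shape)  # (50,)
```

Note that the LSTM output is bounded in $(-1, 1)$ elementwise, which keeps the article vector on a stable scale regardless of raw engagement counts.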
2. Fusion of Modalities and Temporal Dynamics
TRMs operate by merging heterogeneous data streams (quantitative engagement metrics, user population embeddings, and contextualized textual features) into a unified representation. Feature fusion occurs via direct concatenation, followed by joint projection and sequential modeling. The LSTM architecture provides the capacity to differentiate "bursty" from "steady" temporal activity by interpreting the per-window engagement counts together with the elapsed-time features. This design enables the network to learn weights for the different modalities automatically, optimizing the integration of social, semantic, and temporal signals (Ruchansky et al., 2017).
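The temporal-partitioning step that feeds this fusion can be made concrete with a small sketch: hourly bucketing of engagement timestamps into nonempty windows, yielding the per-window count and elapsed-hours features. The helper name and representation are assumptions for illustration.

```python
# Bucket engagement timestamps into nonempty hourly windows and derive
# (window index, count, elapsed hours since previous nonempty window).
from collections import Counter

def window_features(hours):
    """hours: engagement times in hours since article publication."""
    buckets = Counter(int(h) for h in hours)   # hourly windows
    feats = []
    prev = None
    for t in sorted(buckets):                  # nonempty windows only
        dt = 0 if prev is None else t - prev   # elapsed since last window
        feats.append((t, buckets[t], dt))
        prev = t
    return feats

# A "bursty" pattern: many engagements early, then a late revival.
print(window_features([0.1, 0.2, 0.5, 0.9, 1.3, 7.5, 7.8]))
# → [(0, 4, 0), (1, 1, 1), (7, 2, 6)]
```

The long gap (elapsed hours of 6) before the final window is exactly the kind of signal that lets the recurrent encoder distinguish bursty from steady engagement.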
3. Training Protocols and Regularization
TRM training is conducted end-to-end within the larger CSI system, utilizing standard supervised frameworks:
- Objective: Binary cross-entropy loss over the $N$ articles,
  $\mathcal{L} = -\frac{1}{N}\sum_{a=1}^{N}\left[y_a \log \hat{y}_a + (1 - y_a)\log(1 - \hat{y}_a)\right]$,
  with $y_a$ the ground-truth label and $\hat{y}_a$ the classification score.
- Optimization: Adam optimizer with learning rate $0.001$.
- Regularization: a weight penalty on the user embedding matrix; dropout applied to the projection layers.
- Batching: Batches of $32$ articles. The entire TRM, together with associated user-scoring and final classification modules, is trained via backpropagation, allowing the system to optimize representation learning across all components.
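The objective above can be written out directly; the following is a sketch of the binary cross-entropy computation (function name and the clipping constant are illustrative choices, not from the source):

```python
# Binary cross-entropy averaged over articles, as in the TRM training objective.
import numpy as np

def bce_loss(y_true, y_pred, eps=1e-12):
    y_pred = np.clip(y_pred, eps, 1 - eps)   # numerical safety near 0 and 1
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

y = np.array([1.0, 0.0, 1.0, 0.0])           # ground-truth labels y_a
p = np.array([0.9, 0.1, 0.8, 0.2])           # classification scores ŷ_a
print(round(bce_loss(y, p), 4))              # → 0.1643
```

In the end-to-end setting this scalar is what the Adam optimizer (learning rate $0.001$) minimizes over batches of $32$ articles.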
4. Integration into Composite Models
Within CSI, TRM-generated vectors are the principal descriptors of article-level patterns. CSI further incorporates a user-scoring subsystem (the "S-module"), which produces a scalar score and an embedding for each user. The aggregated user score $\bar{s}_a$ for article $a$ is computed by averaging over all interacting users. The full CSI classifier combines the TRM article vector $v_a$ with the user signal $\bar{s}_a$ via a final dense layer and sigmoid mapping, yielding the final article classification score. A plausible implication is that TRM enables modular integration, making it adaptable for other frameworks requiring article-user pattern fusion.
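The final fusion step can be sketched as follows; the weight names and shapes are assumptions for illustration rather than the CSI reference code:

```python
# Final CSI-style classification: concatenate the article vector with the
# aggregated user score, then apply a dense layer and a sigmoid.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def csi_classify(v_a, s_bar, w, b):
    z = np.concatenate([v_a, [s_bar]])   # fuse article and user signals
    return float(sigmoid(w @ z + b))     # classification score in (0, 1)

rng = np.random.default_rng(1)
v_a = rng.standard_normal(50)            # article vector from the TRM
s_bar = 0.3                              # mean user score for this article
w = 0.1 * rng.standard_normal(51)
score = csi_classify(v_a, s_bar, w, b=0.0)
print(0.0 < score < 1.0)  # → True
```

Because the two signals enter through a single learned dense layer, the model can weight the article-level and user-level evidence against each other during training.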
5. Empirical Validation and Efficiency
Evaluations demonstrate that TRM, as part of CSI, yields substantial predictive gains relative to vanilla recurrent benchmarks:
- TS ablation (TRM + naive user averaging) on Twitter and Weibo datasets: accuracy $0.854$/$0.939$, F1 $0.848$/$0.940$.
- Full CSI (TRM + S-module): accuracy $0.892$/$0.953$, F1 $0.894$/$0.954$.
- Improvement over GRU/LSTM baselines of at least $4$ percentage points of absolute accuracy.
- CSI/TRM achieves comparable generalization with roughly an order of magnitude fewer parameters than the $621$K-parameter baselines, and matches baseline performance with only a fraction of the labeled data (Ruchansky et al., 2017). This suggests that parameter efficiency and data efficiency are key strengths of TRM for sequence-based content-response modeling.
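The parameter-efficiency claim is easy to make concrete with the standard LSTM parameter count, $4(d_{in} d_h + d_h^2 + d_h)$. The dimensions below are illustrative choices, not the exact CSI or baseline configurations:

```python
# Standard LSTM parameter count: four gates, each with an input weight
# matrix, a recurrent weight matrix, and a bias vector.
def lstm_params(d_in, d_h):
    return 4 * (d_in * d_h + d_h * d_h + d_h)

# A compact TRM-style LSTM over embedded features vs. a wide baseline LSTM
# over raw features (dimensions assumed for illustration):
small = lstm_params(64, 50)      # tens of thousands of parameters
large = lstm_params(500, 300)    # approaching a million
print(small, large, large // small)  # → 23000 961200 41
```

The quadratic $d_h^2$ recurrent term dominates, which is why projecting fused features into a small embedding before the LSTM shrinks the model so sharply.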
6. Design Rationale and Broader Implications
TRMs formalize the representation of article-level engagement history by jointly encoding multiple feature modalities, with emphasis on temporal structure and distributed semantics. The use of doc2vec and SVD-defined user profiles leverages unsupervised pretraining for robust feature extraction. LSTM recurrence provides flexibility for different temporal patterns, and the dense output vector is suitable for downstream classification, integration, or retrieval tasks. A plausible implication is that such modules can be generalized to other domains involving intertwined text and social feedback, such as rumor propagation analysis, topic diffusion, and collaborative filtering.